date:20200127

[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2020-01-27 Thread Xin-Chun Zhang (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024929#comment-17024929
 ] 

Xin-Chun Zhang commented on LUCENE-9004:


Is it possible to merge LUCENE-9136 with this issue?

> Approximate nearest vector search
> -
>
> Key: LUCENE-9004
> URL: https://issues.apache.org/jira/browse/LUCENE-9004
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael Sokolov
>Priority: Major
> Attachments: hnsw_layered_graph.png
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> "Semantic" search based on machine-learned vector "embeddings" representing 
> terms, queries and documents is becoming a must-have feature for a modern 
> search engine. SOLR-12890 is exploring various approaches to this, including 
> providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers 
> have found an approach based on navigating a graph that partially encodes the 
> nearest neighbor relation at multiple scales can provide accuracy > 95% (as 
> compared to exact nearest neighbor calculations) at a reasonable cost. This 
> issue will explore implementing HNSW (hierarchical navigable small-world) 
> graphs for the purpose of approximate nearest vector search (often referred 
> to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this. First assume you have a 
> graph that has a partial encoding of the nearest neighbor relation, with some 
> short and some long-distance links. If this graph is built in the right way 
> (has the hierarchical navigable small world property), then you can 
> efficiently traverse it to find nearest neighbors (approximately) in log N 
> time where N is the number of nodes in the graph. I believe this idea was 
> pioneered in [1]. The great insight in that paper is that if you use the 
> graph search algorithm to find the K nearest neighbors of a new document 
> while indexing, and then link those neighbors (undirectedly, i.e. both ways) to 
> the new document, then the graph that emerges will have the desired 
> properties.
> The implementation I propose for Lucene is as follows. We need two new data 
> structures to encode the vectors and the graph. We can encode vectors using a 
> light wrapper around {{BinaryDocValues}} (we also want to encode the vector 
> dimension and have efficient conversion from bytes to floats). For the graph 
> we can use {{SortedNumericDocValues}} where the values we encode are the 
> docids of the related documents. Encoding the interdocument relations using 
> docids directly will make it relatively fast to traverse the graph since we 
> won't need to look up through an id-field indirection. This choice limits us 
> to building a graph-per-segment since it would be impractical to maintain a 
> global graph for the whole index in the face of segment merges. However, 
> graph-per-segment is very natural at search time - we can traverse each 
> segment's graph independently and merge results as we do today for term-based 
> search.
> At index time, however, merging graphs is somewhat challenging. While 
> indexing we build a graph incrementally, performing searches to construct 
> links among neighbors. When merging segments we must construct a new graph 
> containing elements of all the merged segments. Ideally we would somehow 
> preserve the work done when building the initial graphs, but at least as a 
> start I'd propose we construct a new graph from scratch when merging. The 
> process is going to be limited, at least initially, to graphs that can fit 
> in RAM since we require random access to the entire graph while constructing 
> it: in order to add links bidirectionally we must continually update existing 
> documents.
> I think we want to express this API to users as a single 
> {{KnnGraphField}} abstraction that joins the vectors and the graph 
> into one field type. Mostly it just looks like a vector-valued 
> field, but has this graph attached to it.
> I'll push a branch with my POC and would love to hear comments. It has many 
> nocommits, the basic design is not really set, there is no Query implementation 
> and no integration with IndexSearcher, but it does work by some measure using 
> a standalone test class. I've tested with uniform random vectors and on my 
> laptop indexed 10K documents in around 10 seconds and searched them at 95% 
> recall (compared with exact nearest-neighbor baseline) at around 250 QPS. I 
> haven't made any attempt to use multithreaded search for this, but it is 
> amenable to per-segment concurrency.
> [1] 
>
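
The indexing and search steps described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions: a single-layer graph (the hierarchical layers of HNSW are left out), squared Euclidean distance, node 0 as the fixed entry point, and in-memory adjacency lists instead of doc values. The class and method names (KnnGraphSketch, add, search) are hypothetical and are not Lucene APIs or the code from the POC branch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

/** Hypothetical single-layer neighbor graph; node ids play the role of docids. */
final class KnnGraphSketch {
  private final List<float[]> vectors = new ArrayList<>();
  private final List<List<Integer>> neighbors = new ArrayList<>();
  private final int maxConn; // number of links created for each new node

  KnnGraphSketch(int maxConn) {
    this.maxConn = maxConn;
  }

  static float squaredDistance(float[] a, float[] b) {
    float sum = 0;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Best-first traversal from node 0; returns up to k node ids, nearest first. */
  List<Integer> search(float[] query, int k) {
    List<Integer> out = new ArrayList<>();
    if (vectors.isEmpty()) {
      return out;
    }
    Comparator<Integer> byDistance =
        Comparator.comparingDouble(n -> squaredDistance(query, vectors.get(n)));
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDistance); // nearest first
    PriorityQueue<Integer> results = new PriorityQueue<>(byDistance.reversed()); // farthest first
    Set<Integer> visited = new HashSet<>();
    candidates.add(0);
    results.add(0);
    visited.add(0);
    while (!candidates.isEmpty()) {
      int current = candidates.poll();
      // Stop once the nearest unexplored candidate is worse than the worst kept result.
      if (results.size() >= k
          && squaredDistance(query, vectors.get(current))
              > squaredDistance(query, vectors.get(results.peek()))) {
        break;
      }
      for (int n : neighbors.get(current)) {
        if (visited.add(n)) {
          candidates.add(n);
          results.add(n);
          if (results.size() > k) {
            results.poll(); // drop the farthest so that at most k results are kept
          }
        }
      }
    }
    while (!results.isEmpty()) {
      out.add(0, results.poll()); // farthest is polled first, so insert at the front
    }
    return out;
  }

  /** Searches for the new vector's nearest neighbors, then links them both ways. */
  int add(float[] vector) {
    List<Integer> nearest = search(vector, maxConn);
    int id = vectors.size();
    vectors.add(vector);
    neighbors.add(new ArrayList<>(nearest));
    for (int n : nearest) {
      neighbors.get(n).add(id); // back-link; existing lists are not pruned in this sketch
    }
    return id;
  }
}
```

Because add updates the neighbor lists of nodes that already exist, the whole graph needs random access while it is being built, which is the RAM constraint the description mentions.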

[jira] [Created] (SOLR-14224) Not able to build solr 6.6.2 from source after January 15, 2020

2020-01-27 Thread Guruprasad K K (Jira)

Guruprasad K K created SOLR-14224:
-

 Summary: Not able to build solr 6.6.2 from source after January 
15, 2020
 Key: SOLR-14224
 URL: https://issues.apache.org/jira/browse/SOLR-14224
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 6.6.2
Reporter: Guruprasad K K


Since Jan 15th, Maven allows only https connections to the repo, but Solr 6.6.2 
uses http connections, so our builds are failing.

It looks like the latest version of Solr fixes this in common_build.xml 
and other places, where it uses https connections to Maven.

What is the workaround if we can't upgrade the Solr version and still 
want to use 6.6.2?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (SOLR-12325) introduce uniqueBlockQuery(parent:true) aggregation for JSON Facet

2020-01-27 Thread Munendra S N (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024847#comment-17024847
 ] 

Munendra S N commented on SOLR-12325:
-

Apologies Mikhail, I was caught up in something else.
+1 to the suggested approach.

> introduce uniqueBlockQuery(parent:true) aggregation for JSON Facet
> --
>
> Key: SOLR-12325
> URL: https://issues.apache.org/jira/browse/SOLR-12325
> Project: Solr
>  Issue Type: New Feature
>  Components: Facet Module
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.5
>
> Attachments: SOLR-12325.patch, SOLR-12325.patch, SOLR-12325.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> It might be a faster twin for {{uniqueBlock(\_root_)}}. Please utilise the built-in 
> query parsing method; don't invent your own. 




[GitHub] [lucene-solr] ErickErickson opened a new pull request #1218: Javacc erick

2020-01-27 Thread GitBox

ErickErickson opened a new pull request #1218: Javacc erick
URL: https://github.com/apache/lucene-solr/pull/1218
 
 
   Here are the build changes to get javacc to run, modeled on the jflex changes; 
many thanks for the model. Only two files changed here ;)
   
   If the structure is OK, I'll fill in the "doLast" blocks with the cleanup 
code and maybe be able to extract some common parts. NOTE: you can't even compile 
the result of running this, because I wanted the changes to the build structure 
to be clear first and so didn't include the cleanup tasks yet.
   
   So if this structure is OK, should I merge it into master before or after 
the rest of the cleanup? My assumption is after. I want to try to get all the 
warnings etc. out of the generated code in the next phase to reduce the 
temptation for people to make hand-edits.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [lucene-solr] madrob commented on issue #1191: SOLR-14197 Reduce API of SolrResourceLoader

2020-01-27 Thread GitBox

madrob commented on issue #1191: SOLR-14197 Reduce API of SolrResourceLoader
URL: https://github.com/apache/lucene-solr/pull/1191#issuecomment-579026541
 
 
   This looks pretty nice and was something I had been thinking about as well. 
Skimmed the first handful of commits and things made sense, I'll try to take a 
deeper look at this tomorrow or Wednesday!



[GitHub] [lucene-solr] madrob commented on issue #1205: SOLR-14206: Annotate HttpSolrCall as thread-safe

2020-01-27 Thread GitBox

madrob commented on issue #1205: SOLR-14206: Annotate HttpSolrCall as 
thread-safe
URL: https://github.com/apache/lucene-solr/pull/1205#issuecomment-579025145
 
 
   I think you already did this in #1203 



[GitHub] [lucene-solr] madrob opened a new pull request #1217: SOLR-14223 PublicKeyHandler consumes a lot of entropy during tests

2020-01-27 Thread GitBox

madrob opened a new pull request #1217: SOLR-14223 PublicKeyHandler consumes a 
lot of entropy during tests
URL: https://github.com/apache/lucene-solr/pull/1217
 
 
   Use a non-blocking implementation of SecureRandom for generating RSA Keys.



[jira] [Commented] (SOLR-14223) PublicKeyHandler consumes a lot of entropy during tests

2020-01-27 Thread Mike Drob (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024777#comment-17024777
 ] 

Mike Drob commented on SOLR-14223:
--

cc: [~noble.paul] [~varun] - Interested in your thoughts since you were active 
on the original issue.

> PublicKeyHandler consumes a lot of entropy during tests
> ---
>
> Key: SOLR-14223
> URL: https://issues.apache.org/jira/browse/SOLR-14223
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.4, 8.0
>Reporter: Mike Drob
>Priority: Major
>
> After the changes in SOLR-12354 to eagerly create a {{PublicKeyHandler}} for 
> the CoreContainer, the creation of the underlying {{RSAKeyPair}} uses 
> {{SecureRandom}} to generate primes. This eats up a lot of system entropy and 
> can slow down tests significantly (I observed it adding 10s to an individual 
> test).
> Similar to what we do for SSL config for tests, we can swap in a non-blocking 
> implementation of SecureRandom for the key pair generation to allow multiple 
> tests to run better in parallel. Primality testing with BigInteger is also 
> slow, so I'm not sure how much total speedup we can get here. Maybe it's 
> worth checking if there are faster implementations out there in other 
> libraries.
> In production cases, this also blocks creation of all cores. We should only 
> create the Handler if necessary, i.e. if the existing authn/z tell us that 
> they won't support internode requests.




[jira] [Created] (SOLR-14223) PublicKeyHandler consumes a lot of entropy during tests

2020-01-27 Thread Mike Drob (Jira)

Mike Drob created SOLR-14223:


 Summary: PublicKeyHandler consumes a lot of entropy during tests
 Key: SOLR-14223
 URL: https://issues.apache.org/jira/browse/SOLR-14223
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 8.0, 7.4
Reporter: Mike Drob


After the changes in SOLR-12354 to eagerly create a {{PublicKeyHandler}} for 
the CoreContainer, the creation of the underlying {{RSAKeyPair}} uses 
{{SecureRandom}} to generate primes. This eats up a lot of system entropy and 
can slow down tests significantly (I observed it adding 10s to an individual 
test).

Similar to what we do for SSL config for tests, we can swap in a non-blocking 
implementation of SecureRandom for the key pair generation to allow multiple 
tests to run better in parallel. Primality testing with BigInteger is also 
slow, so I'm not sure how much total speedup we can get here. Maybe it's worth 
checking if there are faster implementations out there in other libraries.

In production cases, this also blocks creation of all cores. We should only 
create the Handler if necessary, i.e. if the existing authn/z tell us that they 
won't support internode requests.
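
A minimal sketch of the idea, using only JDK classes. It assumes that the "NativePRNGNonBlocking" algorithm (backed by /dev/urandom on Unix-like JVMs) is an acceptable non-blocking source for tests and falls back to the platform default where that algorithm is missing. TestKeyPairs and its methods are hypothetical names, not Solr code, and how such a generator would be wired into {{PublicKeyHandler}} is not shown.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

/** Hypothetical test helper; not part of Solr. */
final class TestKeyPairs {

  /** Returns a non-blocking SecureRandom where the platform offers one. */
  static SecureRandom nonBlockingRandom() {
    try {
      // Standard algorithm name on Unix-like JVMs; seeds from /dev/urandom.
      return SecureRandom.getInstance("NativePRNGNonBlocking");
    } catch (NoSuchAlgorithmException e) {
      // Assumption: on platforms without it (e.g. Windows) the default is good enough.
      return new SecureRandom();
    }
  }

  /** Generates an RSA key pair, drawing randomness from the non-blocking source. */
  static KeyPair generateRsaKeyPair(int keySizeBits) {
    try {
      KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
      generator.initialize(keySizeBits, nonBlockingRandom());
      return generator.generateKeyPair();
    } catch (NoSuchAlgorithmException e) {
      // Every Java platform is required to support RSA key pair generation.
      throw new IllegalStateException(e);
    }
  }
}
```

This only removes the wait for entropy. The prime search itself still costs CPU time, which is the BigInteger concern raised above.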




[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse

2020-01-27 Thread GitBox

tflobbe commented on a change in pull request #1210: SOLR-14219 force 
serialVersionUID of OverseerSolrResponse
URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371507235
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java
 ##
 @@ -26,7 +26,9 @@
 import java.util.Objects;
 
 public class OverseerSolrResponse extends SolrResponse {
-  
+ 
+  private static final long serialVersionUID = 4721653044098960880L;
 
 Review comment:
   I agree, everything that uses Java serialization should set a 
serialVersionUID; my concern is that it may be too late now. I think you 
discovered a bug with your test (thanks!), but I believe it's too late to add a 
serialVersionUID now, because for some users it could have exactly the same effect 
as if we'd had one before and were now changing it.
   Hopefully this won't be an issue once the serialization is in javabin and 
the Java serialization part is removed.
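
The concern can be illustrated with a small JDK-only sketch. The classes below are hypothetical stand-ins, not OverseerSolrResponse: a class without the field gets a UID computed from its structure, while a declared field pins the UID to the given value.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

/** Hypothetical demo classes; unrelated to the Solr class under review. */
final class SerialVersionDemo {

  /** No explicit field: the UID is derived from the class name, fields and methods. */
  static class Implicit implements Serializable {
    int value;
  }

  /** Explicit field: the UID stays fixed even if members are added later. */
  static class Explicit implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
  }

  /** Returns the UID that Java serialization would write for the given class. */
  static long uidOf(Class<?> cls) {
    return ObjectStreamClass.lookup(cls).getSerialVersionUID();
  }
}
```

Here uidOf(Explicit.class) is the declared 1L, while uidOf(Implicit.class) is a computed hash. If instances were already written with the computed value, a reader whose class declares a different value rejects them with an InvalidClassException, which is why adding the field late can behave like changing an existing one.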



[jira] [Updated] (SOLR-14222) CloudSolrClient converts (update) 403 error to 500 error

2020-01-27 Thread Chris M. Hostetter (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14222:
--
Attachment: SOLR-14222_test.patch
Status: Open  (was: Open)

Attaching SOLR-14222_test.patch, which shows the problem.

> CloudSolrClient converts (update) 403 error to 500 error 
> -
>
> Key: SOLR-14222
> URL: https://issues.apache.org/jira/browse/SOLR-14222
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud, SolrJ
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14222_test.patch
>
>
> Something about the way CloudSolrClient pulls UpdateRequests apart to send 
> docs directly to leaders also seems to cause it to report status code "500" 
> Server Errors when 403 authorization errors are thrown by the server.




[jira] [Created] (SOLR-14222) CloudSolrClient converts (update) 403 error to 500 error

2020-01-27 Thread Chris M. Hostetter (Jira)

Chris M. Hostetter created SOLR-14222:
-

 Summary: CloudSolrClient converts (update) 403 error to 500 error 
 Key: SOLR-14222
 URL: https://issues.apache.org/jira/browse/SOLR-14222
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCloud, SolrJ
Reporter: Chris M. Hostetter


Something about the way CloudSolrClient pulls UpdateRequests apart to send docs 
directly to leaders also seems to cause it to report status code "500" Server 
Errors when 403 authorization errors are thrown by the server.




[GitHub] [lucene-solr] dnhatn commented on issue #1215: LUCENE-9164: Ignore ACE on tragic event if IW is closed

2020-01-27 Thread GitBox

dnhatn commented on issue #1215: LUCENE-9164: Ignore ACE on tragic event if IW 
is closed
URL: https://github.com/apache/lucene-solr/pull/1215#issuecomment-578960010
 
 
   @mikemccand @jpountz Would you mind taking a look? Thank you.



[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud

2020-01-27 Thread Chris M. Hostetter (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024672#comment-17024672
 ] 

Chris M. Hostetter commented on SOLR-14040:
---

FWIW: TestBulkSchemaConcurrent was failing a lot on master as well after your 
original commit, but the master failures seem to have dropped off after your 
Jan 22 commits while the 8x failures continued.

I have not dug into the logs from the failures to compare 8x / master (or 8x 
before/after your "restore legacy Collection auto-creation" commits) to see if 
the *nature* of the failures is different – but you might want to before they get 
purged (my report system only keeps the past 7 days' worth of logs due to disk 
constraints).

 

 

> solr.xml shareSchema does not work in SolrCloud
> ---
>
> Key: SOLR-14040
> URL: https://issues.apache.org/jira/browse/SOLR-14040
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
> Fix For: 8.5
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> solr.xml has a shareSchema boolean option that can be toggled from the 
> default of false to true in order to share IndexSchema objects within the 
> Solr node.  This is silently ignored in SolrCloud mode.  The pertinent code 
> is {{org.apache.solr.core.ConfigSetService#createConfigSetService}} which 
> creates a CloudConfigSetService that is not related to the SchemaCaching 
> class.  This may not be a big deal in SolrCloud which tends not to deal well 
> with many cores per node but I'm working on changing that.




[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud

2020-01-27 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024668#comment-17024668
 ] 

David Smiley commented on SOLR-14040:
-

It appears that problem was recently fixed in SOLR-14211.  Note that the fix was 
in master for a while and was back-ported to 8x only 13 hours ago.





[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud

2020-01-27 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024652#comment-17024652
 ] 

David Smiley commented on SOLR-14040:
-

I have not; I didn't make the connection.  Hmmm, it's interesting that only 8x 
has failed and not master.  I checked that the changes happened on both 
branches for both commits.  Hmmm, looking more...





[jira] [Commented] (SOLR-12325) introduce uniqueBlockQuery(parent:true) aggregation for JSON Facet

2020-01-27 Thread Mikhail Khludnev (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024637#comment-17024637
 ] 

Mikhail Khludnev commented on SOLR-12325:
-

No concerns so far. I'm going to revamp syntax as follows:

||Syntax||Behavior||
|uniqueBlock(field)|as-is field logic|
|uniqueBlock($fieldparam)..=field|as-is field reference logic|
|uniqueBlock(\{!v=type_s:pipe\})|new query logic|
|uniqueBlock(\{!v=$qref\})...=K:amber some|new query referencing logic|

Looking forward to your opinion.






[jira] [Commented] (SOLR-11207) Add OWASP dependency checker to detect security vulnerabilities in third party libraries

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-11207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024624#comment-17024624
 ] 

ASF subversion and git services commented on SOLR-11207:


Commit 53f7b394e49e9b6d5f3e3aa6980078421d87688e in lucene-solr's branch 
refs/heads/master from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=53f7b39 ]

SOLR-11207: Mute warnings for owasp false positives


> Add OWASP dependency checker to detect security vulnerabilities in third 
> party libraries
> 
>
> Key: SOLR-11207
> URL: https://issues.apache.org/jira/browse/SOLR-11207
> Project: Solr
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 6.0
>Reporter: Hrishikesh Gadre
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The Lucene/Solr project depends on a number of third party libraries. Some of those 
> libraries contain security vulnerabilities. Upgrading to versions of those 
> libraries that have fixes for those vulnerabilities is a simple, critical 
> step we can take to improve the security of the system. But for that we need 
> a tool which can scan the Lucene/Solr dependencies and look up the security 
> database for known vulnerabilities.
> I found that [OWASP 
> dependency-checker|https://jeremylong.github.io/DependencyCheck/dependency-check-ant/]
>  can be used for this purpose. It provides an Ant task which we can include in 
> the Lucene/Solr build. We also need to figure out how (and when) to invoke 
> this dependency-checker. But this can be figured out once we complete the 
> first step of integrating this tool with the Lucene/Solr build system.




[jira] [Commented] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024612#comment-17024612
 ] 

ASF subversion and git services commented on LUCENE-9184:
-

Commit ff635cf701f086241117c5dab925aa5ef825ce51 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ff635cf ]

LUCENE-9184, LUCENE-9183: allow skipping git status check in precommit with 
-Pvalidation.git.failOnModified=false (or place this in gradle.properties to 
make it permanent).


> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
>
> Depending on the type of Git client you are using (I hate the command line, I 
> use Eclipse Git or TortoiseGit -- my preference), the way files are 
> committed differs. Normally with the git command line you would first stage all 
> files and then commit them. If you stage them and then run precommit, it 
> works fine, as the "changed" and "added" and other statuses are ignored and it's 
> still confirmed as "clean". After the precommit task you finally commit.
> But Git GUIs don't have the concept of staging. You can (similar to 
> Subversion) add files and delete files, but when you modify a file you cannot 
> explicitly "stage" the change. What you do is open the commit GUI, put 
> checkboxes on all files you want to commit and then the GUI triggers a stage 
> and commit directly after each other.
> In this workflow, the precommit check of course complains about "modified" 
> files.
> This is the reason why the Ant task does have 2 modes:
> - The strict mode which forbids any change in the working copy, so it must be 
> 100% clean. By default, Ant only runs this if the property "is.jenkins.build" 
> is enabled. The reason for that is to detect any change in the working copy 
> caused by running the Jenkins CI (like temporary files munging around).
> - The default "committer/developer" mode: In this case the working copy check 
> only complains about "untracked" or "missing" files. So a committer who 
> changes some files can still pass precommit. If he adds a new file, he has to 
> add it to the index, so it's not untracked. But generally normal modifications 
> of the working copy are allowed.
> Please add this back. There was a reason why I set up the check-working-copy 
> Ant task like it was.
> If others agree, I'd like to change the task so it has two modes:
> - Full clean mode (for CI builds), enabled only if it's a CI build -- we 
> should maybe add some tasks like "jenkins-hourly" on the root project that enable 
> this mode
> - Developer mode (default), that does not care about "modified" files.




[jira] [Resolved] (LUCENE-9183) Allow optional skipping of git status check in precommit

2020-01-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9183.
-

> Allow optional skipping of git status check in precommit
> 
>
> Key: LUCENE-9183
> URL: https://issues.apache.org/jira/browse/LUCENE-9183
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>
> Had an offline conversation with Uwe about it. For people who don't use git 
> staging (only IDEs) the precommit may be problematic as it currently fails on 
> locally changed files. 
> I'll add an option to skip it, if the developer so desires.




[jira] [Commented] (LUCENE-9183) Allow optional skipping of git status check in precommit

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024613#comment-17024613
 ] 

ASF subversion and git services commented on LUCENE-9183:
-

Commit ff635cf701f086241117c5dab925aa5ef825ce51 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ff635cf ]

LUCENE-9184, LUCENE-9183: allow skipping git status check in precommit with 
-Pvalidation.git.failOnModified=false (or place this in gradle.properties to 
make it permanent).
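
As a rough illustration of how such a flag could be resolved, here is a minimal Java sketch. Only the property name validation.git.failOnModified comes from the commit message above. The class FailOnModifiedOption, its method, and the default of "true" when the property is unset are assumptions made for this example, and the sketch is not the actual Gradle build logic. It models a command-line value (as passed with -P) taking precedence over a value from gradle.properties.

```java
import java.util.Properties;

/** Hypothetical helper: models how the flag might be resolved, not the real build logic. */
final class FailOnModifiedOption {
  /** Property name taken from the commit message. */
  static final String KEY = "validation.git.failOnModified";

  /**
   * The command-line value wins over the properties file.
   * If the property is set nowhere, the check stays enabled (assumed default).
   */
  static boolean failOnModified(Properties commandLine, Properties gradleProperties) {
    String fromFile = gradleProperties.getProperty(KEY, "true");
    return Boolean.parseBoolean(commandLine.getProperty(KEY, fromFile));
  }
}
```

With both property sets empty the sketch keeps the check enabled; setting the property to false in either place disables it unless the command line says otherwise.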


> Allow optional skipping of git status check in precommit
> 
>
> Key: LUCENE-9183
> URL: https://issues.apache.org/jira/browse/LUCENE-9183
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>
> Had an offline conversation with Uwe about it. For people who don't use git 
> staging (only IDEs) the precommit may be problematic as it currently fails on 
> locally changed files. 
> I'll add an option to skip it, if the developer so desires.




[jira] [Resolved] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9184.
-
Resolution: Fixed

> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
>
> Depending on the type of Git Client you are using (I hate the command line, I 
> use Eclipse Git or TortoiseGit -- my preference), the way files are 
> committed differs. Normally with the git command line you would first stage all 
> files and then commit them. If you stage them and then run precommit, it 
> works fine, as the "changed", "added" and other statuses are ignored and it's 
> still confirmed as "clean". After the precommit task you finally commit.
> But Git GUIs don't have the concept of staging. You can (similar to 
> Subversion) add files and delete files, but when you modify a file you cannot 
> explicitly "stage" the change. Instead, you open the commit GUI, tick the 
> checkboxes on all files you want to commit, and the GUI then triggers a stage 
> and a commit directly after each other.
> In this workflow, the precommit check of course complains about "modified" 
> files.
> This is the reason why the Ant task has two modes:
> - The strict mode, which forbids any change in the working copy, so it must be 
> 100% clean. By default, Ant only runs this if the property "is.jenkins.build" 
> is enabled. The reason for that is to detect any change in the working copy 
> caused by running the Jenkins CI (like temporary files left lying around).
> - The default "committer/developer" mode: in this case the working copy check 
> only complains about "untracked" or "missing" files. So a committer who 
> changes some files can still pass precommit. If they add a new file, they have to 
> add it to the index, so it's not untracked. But generally, normal modifications 
> of the working copy are allowed.
> Please add this back. There was a reason why I set up the check-working-copy 
> Ant task like it was.
> If others agree, I'd like to change the task so it has two modes:
> - Full clean mode (for CI builds), enabled only if it's a CI build -- we 
> should maybe add some tasks like "jenkins-hourly" on the root project that enable 
> this mode
> - Developer mode (default), that does not care about "modified" files.




[jira] [Updated] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-9184:
--
Description: 
Depending on the type of Git Client you are using (I hate the command line, I 
use Eclipse Git or TortoiseGit -- my preference), the way files are 
committed differs. Normally with the git command line you would first stage all 
files and then commit them. If you stage them and then run precommit, it works 
fine, as the "changed", "added" and other statuses are ignored and it's still 
confirmed as "clean". After the precommit task you finally commit.

But Git GUIs don't have the concept of staging. You can (similar to Subversion) 
add files and delete files, but when you modify a file you cannot explicitly 
"stage" the change. Instead, you open the commit GUI, tick the checkboxes on 
all files you want to commit, and the GUI then triggers a stage and a commit 
directly after each other.

In this workflow, the precommit check of course complains about "modified" 
files.

This is the reason why the Ant task has two modes:
- The strict mode, which forbids any change in the working copy, so it must be 
100% clean. By default, Ant only runs this if the property "is.jenkins.build" 
is enabled. The reason for that is to detect any change in the working copy 
caused by running the Jenkins CI (like temporary files left lying around).
- The default "committer/developer" mode: in this case the working copy check 
only complains about "untracked" or "missing" files. So a committer who changes 
some files can still pass precommit. If they add a new file, they have to add it to 
the index, so it's not untracked. But generally, normal modifications of the working 
copy are allowed.

Please add this back. There was a reason why I set up the check-working-copy 
Ant task like it was.

If others agree, I'd like to change the task so it has two modes:
- Full clean mode (for CI builds), enabled only if it's a CI build -- we should 
maybe add some tasks like "jenkins-hourly" on the root project that enable this mode
- Developer mode (default), that does not care about "modified" files.

  was:
Depending on the type of Git Client you are using (I hate the command line, I 
use Eclipse Git or TortoiseGit -- my preference), the way files are 
committed differs. Normally with the git command line you would first stage all 
files and then commit them. If you stage them and then run precommit, it works 
fine, as the "changed", "added" and other statuses are ignored and it's still 
confirmed as "clean". After the precommit task you finally commit.

But Git GUIs don't have the concept of staging. You can (similar to Subversion) 
add files and delete files, but when you modify a file you cannot explicitly 
"stage" the change. Instead, you open the commit GUI, tick the checkboxes on 
all files you want to commit, and the GUI then triggers a stage and a commit 
directly after each other.

In this workflow, the precommit check of course complains about "modified" 
files.

This is the reason why the Ant task has two modes:
- The strict mode, which forbids any change in the working copy, so it must be 
100% clean. By default, Ant only runs this if the property "is.jenkins.build" 
is enabled. The reason for that is to detect any change in the working copy 
caused by running the Jenkins CI (like temporary files left lying around).
- The default "committer/developer" mode: in this case the working copy check 
only complains about "untracked" or "missing" files. So a committer who changes 
some files can still pass precommit. If they add a new file, they have to add it to 
the index, so it's not untracked. But generally, normal modifications of the working 
copy are allowed.

Please add this back. There was a reason why I set up the check-working-copy 
Ant tak like it was.

If others agree, I'd like to change the task so it has two modes:
- Full clean mode (for CI builds), enabled only if it's a CI build -- we should 
maybe add some tasks like "jenkins-hourly" on the root project that enable this mode
- Developer mode (default), that does not care about "modified" files.


> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
>

[jira] [Reopened] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reopened LUCENE-9184:
---
  Assignee: Dawid Weiss

Lol, we both closed the linked issues.

> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
>




[jira] [Updated] (LUCENE-9183) Allow optional skipping of git status check in precommit

2020-01-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-9183:

Status: Reopened  (was: Closed)

> Allow optional skipping of git status check in precommit
> 
>
> Key: LUCENE-9183
> URL: https://issues.apache.org/jira/browse/LUCENE-9183
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>
> Had an offline conversation with Uwe about it. For people who don't use git 
> staging (only IDEs) the precommit may be problematic as it currently fails on 
> locally changed files. 
> I'll add an option to skip it, if the developer so desires.




[jira] [Resolved] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9184.
-
  Assignee: (was: Dawid Weiss)
Resolution: Duplicate

Duplicate of LUCENE-9183

> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Priority: Major
> Fix For: master (9.0)
>
>




[jira] [Assigned] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-9184:
-

Assignee: Dawid Weiss  (was: Uwe Schindler)

> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
>




[jira] [Assigned] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-9184:
-

Assignee: Dawid Weiss

> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Dawid Weiss
>Priority: Major
> Fix For: master (9.0)
>
>




[jira] [Resolved] (LUCENE-9183) Allow optional skipping of git status check in precommit

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-9183.
---
Resolution: Duplicate

> Allow optional skipping of git status check in precommit
> 
>
> Key: LUCENE-9183
> URL: https://issues.apache.org/jira/browse/LUCENE-9183
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>
> Had an offline conversation with Uwe about it. For people who don't use git 
> staging (only IDEs) the precommit may be problematic as it currently fails on 
> locally changed files. 
> I'll add an option to skip it, if the developer so desires.




[jira] [Assigned] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-9184:
-

Assignee: Uwe Schindler  (was: Dawid Weiss)

> Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)
> -
>
> Key: LUCENE-9184
> URL: https://issues.apache.org/jira/browse/LUCENE-9184
> Project: Lucene - Core
>  Issue Type: Wish
>  Components: general/build
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>Priority: Major
> Fix For: master (9.0)
>
>




[jira] [Closed] (LUCENE-9183) Allow optional skipping of git status check in precommit

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler closed LUCENE-9183.
-

> Allow optional skipping of git status check in precommit
> 
>
> Key: LUCENE-9183
> URL: https://issues.apache.org/jira/browse/LUCENE-9183
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>
> Had an offline conversation with Uwe about it. For people who don't use git 
> staging (only IDEs) the precommit may be problematic as it currently fails on 
> locally changed files. 
> I'll add an option to skip it, if the developer so desires.




[jira] [Created] (LUCENE-9184) Add relaxed mode for "checkWorkingCopyClean" in Gradle build (similar to Ant)

2020-01-27 Thread Uwe Schindler (Jira)

Uwe Schindler created LUCENE-9184:
-

 Summary: Add relaxed mode for "checkWorkingCopyClean" in Gradle 
build (similar to Ant)
 Key: LUCENE-9184
 URL: https://issues.apache.org/jira/browse/LUCENE-9184
 Project: Lucene - Core
  Issue Type: Wish
  Components: general/build
Reporter: Uwe Schindler
 Fix For: master (9.0)


Depending on the type of Git Client you are using (I hate the command line, I 
use Eclipse Git or TortoiseGit -- my preference), the way files are 
committed differs. Normally with the git command line you would first stage all 
files and then commit them. If you stage them and then run precommit, it works 
fine, as the "changed", "added" and other statuses are ignored and it's still 
confirmed as "clean". After the precommit task you finally commit.

But Git GUIs don't have the concept of staging. You can (similar to Subversion) 
add files and delete files, but when you modify a file you cannot explicitly 
"stage" the change. Instead, you open the commit GUI, tick the checkboxes on 
all files you want to commit, and the GUI then triggers a stage and a commit 
directly after each other.

In this workflow, the precommit check of course complains about "modified" 
files.

This is the reason why the Ant task has two modes:
- The strict mode, which forbids any change in the working copy, so it must be 
100% clean. By default, Ant only runs this if the property "is.jenkins.build" 
is enabled. The reason for that is to detect any change in the working copy 
caused by running the Jenkins CI (like temporary files left lying around).
- The default "committer/developer" mode: in this case the working copy check 
only complains about "untracked" or "missing" files. So a committer who changes 
some files can still pass precommit. If they add a new file, they have to add it to 
the index, so it's not untracked. But generally, normal modifications of the working 
copy are allowed.

Please add this back. There was a reason why I set up the check-working-copy 
Ant task like it was.

If others agree, I'd like to change the task so it has two modes:
- Full clean mode (for CI builds), enabled only if it's a CI build -- we should 
maybe add some tasks like "jenkins-hourly" on the root project that enable this mode
- Developer mode (default), that does not care about "modified" files.
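
The two modes described above could be sketched roughly as follows. This is a minimal Java illustration, not the actual Ant or Gradle task: the class WorkingCopyCheck, the State enum and the violations method are hypothetical names, and the status map stands in for whatever a real git status query would report. In strict mode every listed change counts as a violation; in developer mode only "untracked" and "missing" files do.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of a two-mode working-copy check; not the real build task. */
final class WorkingCopyCheck {
  /** File states loosely modeled on the ones named in the description (assumed names). */
  enum State { ADDED, CHANGED, MODIFIED, UNTRACKED, MISSING }

  /** States that fail the check even in the relaxed developer mode. */
  private static final Set<State> DEVELOPER_FAILURES =
      EnumSet.of(State.UNTRACKED, State.MISSING);

  /**
   * Returns the offending paths, sorted by path.
   * strict = true: any change in the working copy is a violation (CI mode).
   * strict = false: only untracked or missing files are violations (developer mode).
   */
  static List<String> violations(Map<String, State> status, boolean strict) {
    List<String> offending = new ArrayList<>();
    for (Map.Entry<String, State> entry : new TreeMap<>(status).entrySet()) {
      if (strict || DEVELOPER_FAILURES.contains(entry.getValue())) {
        offending.add(entry.getKey() + " (" + entry.getValue() + ")");
      }
    }
    return offending;
  }
}
```

A build could then fail precommit when the returned list is non-empty, choosing strict only for CI runs.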




[jira] [Created] (LUCENE-9183) Allow optional skipping of git status check in precommit

2020-01-27 Thread Dawid Weiss (Jira)

Dawid Weiss created LUCENE-9183:
---

 Summary: Allow optional skipping of git status check in precommit
 Key: LUCENE-9183
 URL: https://issues.apache.org/jira/browse/LUCENE-9183
 Project: Lucene - Core
  Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss


Had an offline conversation with Uwe about it. For people who don't use git 
staging (only IDEs) the precommit may be problematic as it currently fails on 
locally changed files. 

I'll add an option to skip it, if the developer so desires.




[jira] [Commented] (LUCENE-9171) Synonyms Boost by Payload

2020-01-27 Thread Alessandro Benedetti (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024589#comment-17024589
 ] 

Alessandro Benedetti commented on LUCENE-9171:
--

Thanks [~romseygeek], your feedback has been extremely valuable.
I proceeded with the implementation.
The code is attached to the PR, and it seems much cleaner to me now that I 
followed the AttributeSource approach.

Let me know,


> Synonyms Boost by Payload
> -
>
> Key: LUCENE-9171
> URL: https://issues.apache.org/jira/browse/LUCENE-9171
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/queryparser
>Reporter: Alessandro Benedetti
>Priority: Major
>
> I have been working on the additional capability of boosting queries by term 
> payloads, enabled through a parameter in the Lucene Query Builder.
> This has been done targeting the Synonyms Query.
> It is parametric, so there should be no difference unless the feature is 
> enabled.
> Solr has its bits to comply through its SynonymsQueryStyles




[jira] [Commented] (SOLR-14201) some SolrCore are not released after being removed

2020-01-27 Thread Christine Poerschke (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024584#comment-17024584
 ] 

Christine Poerschke commented on SOLR-14201:


Thanks [~vinhlh] for sharing these steps to reproduce the issue, and details of 
the unreleased classes.

I notice that 'optimise' and 'alias' use is part of the steps; if one or both 
of them were omitted, whether or not the issue continues to happen might 
provide further insights, if that has not already been tried.

Specifically w.r.t. the 'optimise' step, it might be interesting to explore if 
the optimise has finished by the time the collection deletion is requested. 
[~GoodmanR]'s SOLR-13609 ticket is also about visibility into optimise progress.

> some SolrCore are not released after being removed
> --
>
> Key: SOLR-14201
> URL: https://issues.apache.org/jira/browse/SOLR-14201
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: image-2020-01-22-10-39-15-301.png, 
> image-2020-01-22-10-42-17-511.png, image-2020-01-22-12-28-46-241.png, 
> image-2020-01-22-14-45-52-730.png
>
>
> [~vinhlh] reported in SOLR-10506 (affecting 6.5 with fixes in 6.6.6 and 7.0):
> bq. In 7.7.2, some SolrCore still are not released after being removed.
> https://issues.apache.org/jira/secure/attachment/12991357/image-2020-01-20-14-51-26-411.png
> Starting this ticket for a separate investigation and fix. A next 
> investigative step could be to try and reproduce the issue on the latest 8.x 
> release.
>   
>   
>   




[jira] [Commented] (SOLR-14213) Configuring Solr Cloud to use Shared Storage

2020-01-27 Thread Andy Vuong (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024570#comment-17024570
 ] 

Andy Vuong commented on SOLR-14213:
---

This should probably be in solr.xml and not solrconfig.xml as I said above.

> Configuring Solr Cloud to use Shared Storage
> 
>
> Key: SOLR-14213
> URL: https://issues.apache.org/jira/browse/SOLR-14213
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Andy Vuong
>Priority: Minor
>
>  Clients can currently create shared collections by sending a collection 
> admin command such as
> *_solr/admin/collections?action=CREATE=gettingstarted=true=1_*
>  
> There is a set of shared-storage-specific classes, such as 
> SharedStorageManager, that get initialized on startup when the CoreContainer 
> loads. There are also components that are lazily loaded when shared storage 
> functionality is needed. It was initially written this way because a Solr 
> Cloud cluster could spin up and not use shared collections, in which case 
> shared store components wouldn’t need to be loaded. There is also no support 
> for configuring Solr Cloud to use shared storage via config files. Lazy 
> loading leads to some poor code and initialization flow that should be 
> revisited.
> This JIRA is for designing the configuration of Solr Cloud to use shared 
> storage and initializing shared storage components based on this.




[jira] [Commented] (SOLR-8776) Support RankQuery in grouping

2020-01-27 Thread David White (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024561#comment-17024561
 ] 

David White commented on SOLR-8776:
---

Is there any plan on moving this fix forward into an official version of Solr? 
This is a crucial bug.

> Support RankQuery in grouping
> -
>
> Key: SOLR-8776
> URL: https://issues.apache.org/jira/browse/SOLR-8776
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 6.0
>Reporter: Diego Ceccarelli
>Priority: Minor
> Attachments: 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, 
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, 
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, 
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch, 
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently it is not possible to use RankQuery [1] and Grouping [2] together 
> (see also [3]). In some situations Grouping can be replaced by Collapse and 
> Expand Results [4] (that supports reranking), but i) collapse cannot 
> guarantee that at least a minimum number of groups will be returned for a 
> query, and ii) in the Solr Cloud setting you will have constraints on how to 
> partition the documents among the shards.
> I'm going to start working on supporting RankQuery in grouping. I'll start 
> attaching a patch with a test that fails because grouping does not support 
> the rank query and then I'll try to fix the problem, starting from the non 
> distributed setting (GroupingSearch).
> My feeling is that since grouping is mostly performed by Lucene, RankQuery 
> should be refactored and moved (or partially moved) there. 
> Any feedback is welcome.
> [1] https://cwiki.apache.org/confluence/display/solr/RankQuery+API 
> [2] https://cwiki.apache.org/confluence/display/solr/Result+Grouping
> [3] 
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201507.mbox/%3ccahm-lpuvspest-sw63_8a6gt-wor6ds_t_nb2rope93e4+s...@mail.gmail.com%3E
> [4] 
> https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results




[GitHub] [lucene-solr] andywebb1975 commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse

2020-01-27 Thread GitBox

andywebb1975 commented on a change in pull request #1210: SOLR-14219 force 
serialVersionUID of OverseerSolrResponse
URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371411317
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java
 ##
 @@ -26,7 +26,9 @@
 import java.util.Objects;
 
 public class OverseerSolrResponse extends SolrResponse {
-  
+ 
+  private static final long serialVersionUID = 4721653044098960880L;
 
 Review comment:
   hi Tomas, that's my understanding too - and it's possible that people may be 
running builds of earlier Solr versions whose serialVersionUID for this class 
differs from 472165... . 
   
   I've some (slightly shaky!) evidence in support of doing it this way: in 
https://github.com/apache/lucene-solr/pull/1140 for SOLR-14165 I saw that the 
same earlier UID for SolrResponse had been generated by several different 
stacks, but it's possible the stacks weren't sufficiently different. Also I can 
see about a dozen instances of serialVersionUID being set to explicit values 
(other than 1) in the Lucene/Solr codebase, presumably for reasons similar to 
the current one - though to be fair I've no way to tell if these have caused 
compatibility issues with custom builds in the past.
   
   I do think that serialVersionUID should be set explicitly for all 
serializable classes, and that setting it to the value used in the earlier 
official release builds is the best value to use when it's being set 
retrospectively like this.
   
   In the interests of finding other approaches to making the serialization vs 
javabin change backwards-compatible, I've just run a build that didn't set 
serialVersionUID explicitly but made useUnsafeSerialization and 
useUnsafeDeserialization private, to see if this would give me the same default 
UID as before. It didn't - I got a third value -550706... instead. So as far as 
I can see the only options here are to remove all the changes from 
OverseerSolrResponse and put them elsewhere as you describe above, or to set 
its serialVersionUID to a possibly-incompatible value as I've done in this PR.
   
   hope this helps!
   Andy
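
To make the point about explicit versus default UIDs concrete, here is a minimal sketch. It assumes two hypothetical stand-in classes (`PinnedResponse`, `UnpinnedResponse`) rather than the real `OverseerSolrResponse`, and only uses the JDK's `ObjectStreamClass` to read back the UID that Java serialization would use. The explicit value is the one from the diff above.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

final class SerialUidSketch {

  // Hypothetical stand-in for a serializable response class.
  static class PinnedResponse implements Serializable {
    // Declared explicitly, so serialization uses this value as-is.
    private static final long serialVersionUID = 4721653044098960880L;
    boolean success;
  }

  // Same shape, but the UID is left to the JVM's default computation,
  // which hashes details of the class such as its name, fields and
  // non-private methods.
  static class UnpinnedResponse implements Serializable {
    boolean success;
  }

  // Returns the UID that Java serialization associates with the class.
  static long uidOf(Class<?> type) {
    return ObjectStreamClass.lookup(type).getSerialVersionUID();
  }

  public static void main(String[] args) {
    System.out.println("pinned:   " + uidOf(PinnedResponse.class));
    System.out.println("unpinned: " + uidOf(UnpinnedResponse.class));
  }
}
```

Because the default computation takes non-private methods into account, changing a method's visibility is the kind of edit that can shift the computed value, which would be consistent with the third UID reported above.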


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (LUCENE-4702) Terms dictionary compression

2020-01-27 Thread Adrien Grand (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024532#comment-17024532
 ] 

Adrien Grand commented on LUCENE-4702:
--

OK I benchmarked with multi-segment indices this time to try to better 
replicate nightly benchmarks. I opened a pull request at 
https://github.com/apache/lucene-solr/pull/1216 that:
 - removes compression of suffix lengths since it didn't help much anyway,
 - replaces LZ4 on stats by explicit run-length compression,
 - only tries out LZ4 for suffix bytes if the average suffix length is > 6, 
since it's unlikely to meet the saving expectations otherwise; this reduces 
index-time overhead.
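
The last condition can be sketched as a pair of checks. This is a minimal illustration, not the code in the pull request: the class and method names are hypothetical, the threshold of 6 is the one quoted in this comment, and the 75% acceptance ratio is the figure given in the description of pull request #1216.

```java
final class SuffixCompressionHeuristic {

  // Threshold quoted in the comment: try LZ4 only above this average length.
  static final int MIN_AVG_SUFFIX_LENGTH = 6;

  // True when the average suffix length of a block is strictly greater than 6,
  // computed without floating point (total > 6 * count).
  static boolean shouldTryLz4(long totalSuffixBytes, int numTerms) {
    return numTerms > 0 && totalSuffixBytes > (long) MIN_AVG_SUFFIX_LENGTH * numTerms;
  }

  // True when the compressed form is smaller than 75% of the raw form.
  static boolean keepCompressed(long compressedLength, long rawLength) {
    return 4 * compressedLength < 3 * rawLength;
  }

  public static void main(String[] args) {
    System.out.println(shouldTryLz4(70, 10));    // average 7 bytes per suffix
    System.out.println(keepCompressed(74, 100)); // 74% of the raw size
  }
}
```

Checking the cheap condition first means short-suffix blocks skip the compression attempt entirely, which is where the index-time saving would come from.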

On wikibigall, the specialized RLE makes the tim file even smaller with this 
change (969MB vs. 996MB) and luceneutil seems to be a bit happier:

{noformat}
Task               QPS baseline   StdDev  QPS patch   StdDev  Pct diff
IntNRQ                   144.16   (1.2%)     143.47   (1.9%)   -0.5% (  -3% -    2%)
TermBGroup1M              32.04   (5.1%)      31.93   (5.1%)   -0.4% ( -10% -   10%)
TermDTSort                39.13   (0.9%)      39.05   (1.0%)   -0.2% (  -2% -    1%)
TermGroup1M               40.18   (4.0%)      40.12   (3.4%)   -0.2% (  -7% -    7%)
TermTitleSort            124.62   (1.9%)     124.54   (1.6%)   -0.1% (  -3% -    3%)
TermDayOfYearSort         88.37   (6.9%)      88.34   (7.1%)   -0.0% ( -13% -   14%)
TermGroup10K              28.56   (5.0%)      28.56   (4.4%)    0.0% (  -8% -    9%)
IntervalsOrdered           4.50   (1.1%)       4.51   (0.6%)    0.0% (  -1% -    1%)
TermBGroup1M1P            45.83   (4.1%)      45.85   (4.0%)    0.0% (  -7% -    8%)
TermMonthSort            137.33   (1.8%)     137.40   (1.3%)    0.1% (  -2% -    3%)
AndHighHigh               72.97   (2.8%)      73.05   (2.7%)    0.1% (  -5% -    5%)
OrHighMed                 77.75   (2.7%)      77.85   (2.7%)    0.1% (  -5% -    5%)
SpanNear                  10.66   (1.2%)      10.68   (1.2%)    0.2% (  -2% -    2%)
Phrase                    59.75   (4.9%)      59.91   (5.2%)    0.3% (  -9% -   10%)
Term                    1358.87   (6.8%)    1363.02   (6.1%)    0.3% ( -11% -   14%)
AndMedOrHighHigh          28.18   (3.0%)      28.27   (2.5%)    0.3% (  -5% -    6%)
OrHighHigh                18.55   (3.2%)      18.61   (2.2%)    0.3% (  -4% -    5%)
SloppyPhrase              19.41   (3.9%)      19.49   (3.5%)    0.4% (  -6% -    8%)
AndHighMed                65.81   (2.8%)      66.15   (2.4%)    0.5% (  -4% -    5%)
AndHighOrMedMed           36.49   (2.5%)      36.69   (1.9%)    0.5% (  -3% -    5%)
TermGroup100              12.19   (3.9%)      12.27   (4.0%)    0.6% (  -7% -    8%)
PKLookup                 217.61   (3.2%)     220.39   (3.3%)    1.3% (  -5% -    8%)
Prefix3                  197.95   (3.3%)     202.32   (3.4%)    2.2% (  -4% -    9%)
Wildcard                  37.78   (2.2%)      41.43   (2.8%)    9.6% (   4% -   14%)
Fuzzy1                    47.77   (5.5%)      53.35   (8.4%)   11.7% (  -2% -   27%)
Fuzzy2                    43.69   (7.5%)      49.50  (10.7%)   13.3% (  -4% -   34%)
Respell                   34.05   (1.6%)      41.94   (1.4%)   23.2% (  19% -   26%)
{noformat}

I plan to commit it and see how that affects nightly benchmarks.

> Terms dictionary compression
> 
>
> Key: LUCENE-4702
> URL: https://issues.apache.org/jira/browse/LUCENE-4702
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Trivial
> Attachments: LUCENE-4702.patch, LUCENE-4702.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> I've done a quick test with the block tree terms dictionary by replacing a 
> call to IndexOutput.writeBytes to write suffix bytes with a call to 
> LZ4.compressHC to test the performance hit. Interestingly, search performance 
> was very good (see comparison table below) and the tim files were 14% smaller 
> (from 150432 bytes overall to 129516).
> {noformat}
> Task               QPS baseline   StdDev  QPS compressed   StdDev  Pct diff
> Fuzzy1                   111.50   (2.0%)           78.78   (1.5%)  -29.4% ( -32% -  -26%)
> Fuzzy2                    36.99   (2.7%)           28.59   (1.5%)  -22.7% ( -26% -  -18%)
> Respell                  122.86   (2.1%)          103.89   (1.7%)  -15.4% ( -18% -  -11%)
>

[jira] [Commented] (SOLR-14040) solr.xml shareSchema does not work in SolrCloud

2020-01-27 Thread Chris M. Hostetter (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024531#comment-17024531
 ] 

Chris M. Hostetter commented on SOLR-14040:
---

David: we're still seeing a much higher rate of Jenkins failures from 
TestBulkSchemaConcurrent since your changes than we've ever seen in the past 
... these don't appear to reproduce reliably, suggesting that there is some 
sort of timing/concurrency issue at play (not surprising given the nature of the 
changes and the nature of the test).

Have you investigated these at all?

> solr.xml shareSchema does not work in SolrCloud
> ---
>
> Key: SOLR-14040
> URL: https://issues.apache.org/jira/browse/SOLR-14040
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
> Fix For: 8.5
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> solr.xml has a shareSchema boolean option that can be toggled from the 
> default of false to true in order to share IndexSchema objects within the 
> Solr node.  This is silently ignored in SolrCloud mode.  The pertinent code 
> is {{org.apache.solr.core.ConfigSetService#createConfigSetService}} which 
> creates a CloudConfigSetService that is not related to the SchemaCaching 
> class.  This may not be a big deal in SolrCloud which tends not to deal well 
> with many cores per node but I'm working on changing that.




[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed

2020-01-27 Thread Lucene/Solr QA (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024529#comment-17024529
 ] 

Lucene/Solr QA commented on LUCENE-9164:


| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | compile | 0m 21s | master passed |
|| || || || Patch Compile Tests ||
| +1 | compile | 0m 23s | the patch passed |
| +1 | javac | 0m 23s | the patch passed |
| +1 | Release audit (RAT) | 0m 23s | the patch passed |
| +1 | Check forbidden APIs | 0m 23s | the patch passed |
| +1 | Validate source patterns | 0m 23s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 3m 49s | core in the patch passed. |
| | | 5m 55s | |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-9164 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991924/LUCENE-9164.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 9e4c445d174 |
| ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
| Default Java | LTS |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/251/testReport/ |
| modules | C: lucene/core U: lucene/core |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/251/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |


This message was automatically generated.



> Should not consider ACE a tragedy if IW is closed
> -
>
> Key: LUCENE-9164
> URL: https://issues.apache.org/jira/browse/LUCENE-9164
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: master (9.0), 8.5, 8.4.2
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
> Attachments: LUCENE-9164.patch, LUCENE-9164.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If IndexWriter is closed or being closed, AlreadyClosedException is expected. 
> We should not consider it a tragic event in this case.
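
The rule described above can be sketched as a small predicate. This is only an illustration under stated assumptions: `AlreadyClosedException` here is a local stand-in defined so the sketch compiles on its own, and `stillTragic` and its boolean parameter are hypothetical names, not part of IndexWriter's API. The predicate is meant to be applied only to failures that would otherwise be treated as tragic.

```java
final class TragedySketch {

  // Local stand-in for Lucene's AlreadyClosedException, so the sketch is
  // self-contained; it is not the real class.
  static class AlreadyClosedException extends RuntimeException {
    AlreadyClosedException(String message) {
      super(message);
    }
  }

  // Given a failure that would otherwise be handled as a tragic event, decide
  // whether it should still count as one. An AlreadyClosedException raised
  // while the writer is closed or closing is expected, so it is exempted.
  static boolean stillTragic(Throwable failure, boolean writerClosedOrClosing) {
    boolean expectedClose =
        failure instanceof AlreadyClosedException && writerClosedOrClosing;
    return !expectedClose;
  }

  public static void main(String[] args) {
    Throwable ace = new AlreadyClosedException("this IndexWriter is closed");
    System.out.println(stillTragic(ace, true));
    System.out.println(stillTragic(ace, false));
  }
}
```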




[GitHub] [lucene-solr] jpountz opened a new pull request #1216: LUCENE-4702: Reduce terms dictionary compression overhead.

2020-01-27 Thread GitBox

jpountz opened a new pull request #1216: LUCENE-4702: Reduce terms dictionary 
compression overhead.
URL: https://github.com/apache/lucene-solr/pull/1216
 
 
   Changes include:
- Removed LZ4 compression of suffix lengths, which didn't save much space
  anyway.
- For stats, LZ4 was only really used for run-length compression of terms
  whose docFreq is 1. This has been replaced by explicit run-length
  compression.
- Since we only use LZ4 for suffix bytes if the compression ratio is < 75%,
  we now only try LZ4 out if the average suffix length is greater than 6, in
  order to reduce index-time overhead.
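
As an illustration of the second item, run-length compression of terms whose docFreq is 1 could look roughly like the sketch below. The token layout (low bit 1 for a run of singletons, low bit 0 for a literal docFreq) and all names are assumptions made for this example; the actual encoding used in the pull request may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class DocFreqRunLength {

  // Encodes docFreqs as tokens: a run of n consecutive terms with docFreq == 1
  // becomes (n << 1) | 1, any other docFreq d becomes (d << 1).
  // Assumes every docFreq is at least 1 and fits in 30 bits.
  static List<Integer> encode(int[] docFreqs) {
    List<Integer> tokens = new ArrayList<>();
    int singletonRun = 0;
    for (int docFreq : docFreqs) {
      if (docFreq == 1) {
        singletonRun++;
      } else {
        if (singletonRun > 0) {
          tokens.add((singletonRun << 1) | 1);
          singletonRun = 0;
        }
        tokens.add(docFreq << 1);
      }
    }
    if (singletonRun > 0) {
      tokens.add((singletonRun << 1) | 1);
    }
    return tokens;
  }

  // Reverses encode(): expands singleton runs and unpacks literal docFreqs.
  static int[] decode(List<Integer> tokens) {
    List<Integer> docFreqs = new ArrayList<>();
    for (int token : tokens) {
      if ((token & 1) == 1) {
        for (int i = 0; i < (token >>> 1); i++) {
          docFreqs.add(1);
        }
      } else {
        docFreqs.add(token >>> 1);
      }
    }
    return docFreqs.stream().mapToInt(Integer::intValue).toArray();
  }

  public static void main(String[] args) {
    int[] docFreqs = {1, 1, 1, 5, 1, 2};
    System.out.println(encode(docFreqs)); // four tokens for six terms
  }
}
```

Long stretches of unique terms (for example a primary-key field) collapse to a single token this way, without paying for a general-purpose compressor.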
   



[jira] [Commented] (LUCENE-4702) Terms dictionary compression

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024523#comment-17024523
 ] 

ASF subversion and git services commented on LUCENE-4702:
-

Commit 9e4c445d17415e8b8433872df4e263d1ef144dba in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9e4c445 ]

LUCENE-4702: CHANGES entry.


> Terms dictionary compression
> 
>
> Key: LUCENE-4702
> URL: https://issues.apache.org/jira/browse/LUCENE-4702
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Assignee: Adrien Grand
>Priority: Trivial
> Attachments: LUCENE-4702.patch, LUCENE-4702.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> I've done a quick test with the block tree terms dictionary by replacing a 
> call to IndexOutput.writeBytes to write suffix bytes with a call to 
> LZ4.compressHC to test the performance hit. Interestingly, search performance 
> was very good (see comparison table below) and the tim files were 14% smaller 
> (from 150432 bytes overall to 129516).
> {noformat}
> Task               QPS baseline   StdDev  QPS compressed   StdDev  Pct diff
> Fuzzy1                   111.50   (2.0%)           78.78   (1.5%)  -29.4% ( -32% -  -26%)
> Fuzzy2                    36.99   (2.7%)           28.59   (1.5%)  -22.7% ( -26% -  -18%)
> Respell                  122.86   (2.1%)          103.89   (1.7%)  -15.4% ( -18% -  -11%)
> Wildcard                 100.58   (4.3%)           94.42   (3.2%)   -6.1% ( -13% -    1%)
> Prefix3                  124.90   (5.7%)          122.67   (4.7%)   -1.8% ( -11% -    9%)
> OrHighLow                169.87   (6.8%)          167.77   (8.0%)   -1.2% ( -15% -   14%)
> LowTerm                 1949.85   (4.5%)         1929.02   (3.4%)   -1.1% (  -8% -    7%)
> AndHighLow              2011.95   (3.5%)         1991.85   (3.3%)   -1.0% (  -7% -    5%)
> OrHighHigh               155.63   (6.7%)          154.12   (7.9%)   -1.0% ( -14% -   14%)
> AndHighHigh              341.82   (1.2%)          339.49   (1.7%)   -0.7% (  -3% -    2%)
> OrHighMed                217.55   (6.3%)          216.16   (7.1%)   -0.6% ( -13% -   13%)
> IntNRQ                    53.10  (10.9%)           52.90   (8.6%)   -0.4% ( -17% -   21%)
> MedTerm                  998.11   (3.8%)          994.82   (5.6%)   -0.3% (  -9% -    9%)
> MedSpanNear               60.50   (3.7%)           60.36   (4.8%)   -0.2% (  -8% -    8%)
> HighSpanNear              19.74   (4.5%)           19.72   (5.1%)   -0.1% (  -9% -    9%)
> LowSpanNear              101.93   (3.2%)          101.82   (4.4%)   -0.1% (  -7% -    7%)
> AndHighMed               366.18   (1.7%)          366.93   (1.7%)    0.2% (  -3% -    3%)
> PKLookup                 237.28   (4.0%)          237.96   (4.2%)    0.3% (  -7% -    8%)
> MedPhrase                173.17   (4.7%)          174.69   (4.7%)    0.9% (  -8% -   10%)
> LowSloppyPhrase          180.91   (2.6%)          182.79   (2.7%)    1.0% (  -4% -    6%)
> LowPhrase                374.64   (5.5%)          379.11   (5.8%)    1.2% (  -9% -   13%)
> HighTerm                 253.14   (7.9%)          256.97  (11.4%)    1.5% ( -16% -   22%)
> HighPhrase                19.52  (10.6%)           19.83  (11.0%)    1.6% ( -18% -   25%)
> MedSloppyPhrase          141.90   (2.6%)          144.11   (2.5%)    1.6% (  -3% -    6%)
> HighSloppyPhrase          25.26   (4.8%)           25.97   (5.0%)    2.8% (  -6% -   13%)
> {noformat}
> Only queries which are very terms-dictionary-intensive got a performance hit 
> (Fuzzy, Fuzzy2, Respell, Wildcard), other queries including Prefix3 behaved 
> (surprisingly) well.
> Do you think of it as something worth exploring?




[jira] [Commented] (LUCENE-9116) Simplify postings API by removing long[] metadata

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024521#comment-17024521
 ] 

ASF subversion and git services commented on LUCENE-9116:
-

Commit ace4fcc7be47e171d37932a191d646f1924a9319 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ace4fcc ]

LUCENE-9116: Remove long[] from `PostingsWriterBase#encodeTerm`. (#1149) (#1158)

All the metadata can be directly encoded in the `DataOutput`.


> Simplify postings API by removing long[] metadata
> -
>
> Key: LUCENE-9116
> URL: https://issues.apache.org/jira/browse/LUCENE-9116
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: 8.5
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The postings API allows storing metadata about a term either in a long[] or 
> in a byte[]. This is unnecessary as all information could be encoded in the 
> byte[], which is what most codecs do in practice.
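
To illustrate why a separate long[] is not needed, the sketch below packs a few long values into a byte[] with a variable-length encoding (7 payload bits per byte, high bit as a continuation flag) and reads them back. It is a self-contained approximation built on `java.io.ByteArrayOutputStream`; the class and method names are hypothetical and it does not use Lucene's `DataOutput` or `PostingsWriterBase`.

```java
import java.io.ByteArrayOutputStream;

final class TermMetadataBytes {

  // Writes a non-negative long using 1 to 9 bytes, least significant bits first.
  static void writeVLong(ByteArrayOutputStream out, long value) {
    if (value < 0) {
      throw new IllegalArgumentException("value must be non-negative: " + value);
    }
    while ((value & ~0x7FL) != 0) {
      out.write((int) ((value & 0x7F) | 0x80)); // more bytes follow
      value >>>= 7;
    }
    out.write((int) value); // last byte, continuation bit clear
  }

  // Packs all metadata values into one byte[].
  static byte[] encode(long[] metadata) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (long value : metadata) {
      writeVLong(out, value);
    }
    return out.toByteArray();
  }

  // Reads back `count` values written by encode().
  static long[] decode(byte[] bytes, int count) {
    long[] metadata = new long[count];
    int pos = 0;
    for (int i = 0; i < count; i++) {
      long value = 0;
      int shift = 0;
      byte b;
      do {
        b = bytes[pos++];
        value |= (long) (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);
      metadata[i] = value;
    }
    return metadata;
  }

  public static void main(String[] args) {
    long[] filePointers = {0L, 127L, 128L, 300L, 1L << 40};
    System.out.println(encode(filePointers).length + " bytes");
  }
}
```

Small values, which are common for deltas between file pointers, take a single byte each in this form.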




[jira] [Commented] (LUCENE-4702) Terms dictionary compression

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024522#comment-17024522
 ] 

ASF subversion and git services commented on LUCENE-4702:
-

Commit 666bdac64d68c3f247760d0a2a1c7a441502af1e in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=666bdac ]

LUCENE-4702: CHANGES entry.






[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation

2020-01-27 Thread GitBox

dsmiley commented on a change in pull request #1171: SOLR-13892: Add 
'top-level' docValues Join implementation
URL: https://github.com/apache/lucene-solr/pull/1171#discussion_r371376748
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/search/TopLevelJoinQuery.java
 ##
 @@ -0,0 +1,218 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.search;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+
+import org.apache.lucene.index.DocValues;
+import org.apache.lucene.index.LeafReader;
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.index.SortedSetDocValues;
+import org.apache.lucene.search.Collector;
+import org.apache.lucene.search.ConstantScoreScorer;
+import org.apache.lucene.search.ConstantScoreWeight;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.search.ScoreMode;
+import org.apache.lucene.search.Scorer;
+import org.apache.lucene.search.TwoPhaseIterator;
+import org.apache.lucene.search.Weight;
+import org.apache.lucene.util.BytesRef;
+import org.apache.lucene.util.LongBitSet;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.schema.SchemaField;
+import org.apache.solr.search.join.MultiValueTermOrdinalCollector;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class TopLevelJoinQuery extends JoinQuery {
 
 Review comment:
   Always add at least one sentence javadoc for a class



[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024516#comment-17024516
 ] 

Dawid Weiss commented on LUCENE-9182:
-

Ok.

> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024515#comment-17024515
 ] 

Robert Muir commented on LUCENE-9182:
-

Especially this section: 
[https://www.apache.org/legal/src-headers.html#faq-exceptions]

I took a look at other projects such as tomcat, hadoop, etc. I am seeing 
headers for all build.xml, pom.xml, even things like build.properties have 
license headers. The current ant build files in our repo also have license 
headers.





[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024514#comment-17024514
 ] 

Robert Muir commented on LUCENE-9182:
-

I think so. This is just my interpretation from reading 
https://www.apache.org/legal/src-headers.html

> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024512#comment-17024512
 ] 

Dawid Weiss commented on LUCENE-9182:
-

I left them out on purpose... are they really required (is it an Apache legal 
requirement)? If it's not required I wouldn't bother.

> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024513#comment-17024513
 ] 

ASF subversion and git services commented on LUCENE-9182:
-

Commit fd5a0ce7c26eff4524b6968b8e84322299516b17 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd5a0ce ]

LUCENE-9182: the rat-sources.gradle was the one .gradle file already with a 
license header, we don't need it twice


> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed

2020-01-27 Thread Nhat Nguyen (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024511#comment-17024511
 ] 

Nhat Nguyen commented on LUCENE-9164:
-

[~atris] Thanks for looking. This is not about double closing. An outstanding 
refresh (i.e., IndexWriter#getReader) considers ACE a tragedy if IndexWriter is 
closed midway. This behavior is bogus and requires another layer of locking.
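
The rule being proposed can be sketched roughly as follows. This is only a toy model under stated assumptions, not IndexWriter's actual code: the class `TragicEventSketch`, its nested `Writer`, the method `onFailure`, and the local `AlreadyClosedException` stand-in are all hypothetical names made up for illustration.

```java
import java.io.IOException;

public class TragicEventSketch {

    /** Stand-in for Lucene's AlreadyClosedException; hypothetical, for illustration only. */
    public static class AlreadyClosedException extends IllegalStateException {
        public AlreadyClosedException(String message) {
            super(message);
        }
    }

    /** Hypothetical, much-simplified writer that tracks only its closed state and a tragedy. */
    public static class Writer {
        private volatile boolean closed;
        private volatile Throwable tragedy;

        public void close() {
            closed = true;
        }

        /** Records the failure as tragic unless it is an ACE seen while the writer is closed. */
        public void onFailure(Throwable failure) {
            if (failure instanceof AlreadyClosedException && closed) {
                return; // expected: an outstanding refresh raced with close()
            }
            if (tragedy == null) {
                tragedy = failure; // keep only the first tragic failure
            }
        }

        public Throwable getTragedy() {
            return tragedy;
        }
    }

    public static void main(String[] args) {
        Writer closedWriter = new Writer();
        closedWriter.close();
        closedWriter.onFailure(new AlreadyClosedException("refresh raced with close"));
        System.out.println("closed writer tragedy: " + closedWriter.getTragedy());

        Writer openWriter = new Writer();
        openWriter.onFailure(new IOException("disk failure"));
        System.out.println("open writer tragedy: " + openWriter.getTragedy());
    }
}
```

In this sketch, any failure other than an ACE on a closed writer is still recorded, so genuine tragedies are not hidden.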

> Should not consider ACE a tragedy if IW is closed
> -
>
> Key: LUCENE-9164
> URL: https://issues.apache.org/jira/browse/LUCENE-9164
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: master (9.0), 8.5, 8.4.2
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
> Attachments: LUCENE-9164.patch, LUCENE-9164.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If IndexWriter is closed or being closed, AlreadyClosedException is expected. 
> We should not consider it a tragic event in this case.




[jira] [Commented] (LUCENE-8143) Remove SpanBoostQuery

2020-01-27 Thread Alan Woodward (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024509#comment-17024509
 ] 

Alan Woodward commented on LUCENE-8143:
---

I'm not sure it's a fault in SpanScorer - PayloadScoreQuery for example will 
adjust a score depending on individual spans matched, so there is a way to do 
it.  It's just that SpanBoostQuery doesn't...

> Remove SpanBoostQuery
> -
>
> Key: LUCENE-8143
> URL: https://issues.apache.org/jira/browse/LUCENE-8143
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I initially added it so that span queries could still be boosted, but this 
> was actually a mistake: boosts are ignored on inner span queries, only the 
> boost of the top-level span query, the one that performs scoring, is not 
> ignored.




[jira] [Resolved] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Robert Muir (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-9182.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024507#comment-17024507
 ] 

ASF subversion and git services commented on LUCENE-9182:
-

Commit 975df9ddd3688fa3530cb975b77005c4eb863d05 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=975df9d ]

LUCENE-9182: add apache license headers to all .gradle files and enforce in rat 
task


> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Commented] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024505#comment-17024505
 ] 

Robert Muir commented on LUCENE-9182:
-

The only interesting changes are to the rat task itself; the rest I auto-generated.
{code}
diff --git a/gradle/validation/rat-sources.gradle b/gradle/validation/rat-sources.gradle
index c50bd5005e0..82875bab1c4 100644
--- a/gradle/validation/rat-sources.gradle
+++ b/gradle/validation/rat-sources.gradle
@@ -1,3 +1,20 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
 /*
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements.  See the NOTICE file
@@ -40,6 +57,7 @@ configure(rootProject) {
 rat {
 includes += [
 "buildSrc/**/*.java",
+"gradle/**/*.gradle",
 "lucene/tools/forbiddenApis/**",
 "lucene/tools/prettify/**",
 ]
@@ -119,6 +137,7 @@ configure(project(":solr:webapp")) {
 class RatTask extends DefaultTask {
 @Input
 List includes = [
+"*.gradle",
 "*.xml",
 "src/tools/**"
 ]
@@ -131,7 +150,6 @@ class RatTask extends DefaultTask {
 "**/TODO",
 "**/*.txt",
 "**/*.iml",
-"**/*.gradle",
 "build/**"
 ]
{code}
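
To see roughly how the include patterns in this diff select files, here is a small sketch. It uses Java NIO glob matching as a stand-in for the Ant-style patterns that the rat task takes, and the two are not identical (as far as I know, Ant lets `**/` match zero directories, while the NIO glob below needs at least one). `IncludePatternSketch` and `matches` are hypothetical names, and the path separators assume a Unix-like file system.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class IncludePatternSketch {

    /** Returns true if the relative path matches the glob (NIO semantics, not Ant's). */
    public static boolean matches(String glob, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(relativePath));
    }

    public static void main(String[] args) {
        // Pattern added for the root project: .gradle files in subdirectories of gradle/.
        System.out.println(matches("gradle/**/*.gradle", "gradle/validation/rat-sources.gradle"));
        // Pattern added to the per-project defaults: .gradle files directly in the project dir.
        System.out.println(matches("*.gradle", "build.gradle"));
        // A single '*' does not cross directory boundaries.
        System.out.println(matches("*.gradle", "gradle/validation/rat-sources.gradle"));
    }
}
```

The point of the patch is the same either way: `.gradle` files move from the exclude list to the include lists, so rat now checks them for license headers.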

> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Updated] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Robert Muir (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9182:

Attachment: LUCENE-9182.patch

> add apache license headers to all .gradle files and enforce in rat task
> ---
>
> Key: LUCENE-9182
> URL: https://issues.apache.org/jira/browse/LUCENE-9182
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Major
> Attachments: LUCENE-9182.patch
>
>
> Currently rat is ignoring the problem, let's fix it.




[jira] [Created] (LUCENE-9182) add apache license headers to all .gradle files and enforce in rat task

2020-01-27 Thread Robert Muir (Jira)

Robert Muir created LUCENE-9182:
---

 Summary: add apache license headers to all .gradle files and 
enforce in rat task
 Key: LUCENE-9182
 URL: https://issues.apache.org/jira/browse/LUCENE-9182
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-9182.patch

Currently rat is ignoring the problem, let's fix it.




[GitHub] [lucene-solr] tflobbe commented on a change in pull request #1210: SOLR-14219 force serialVersionUID of OverseerSolrResponse

2020-01-27 Thread GitBox

tflobbe commented on a change in pull request #1210: SOLR-14219 force 
serialVersionUID of OverseerSolrResponse
URL: https://github.com/apache/lucene-solr/pull/1210#discussion_r371362174
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/cloud/OverseerSolrResponse.java
 ##
 @@ -26,7 +26,9 @@
 import java.util.Objects;
 
 public class OverseerSolrResponse extends SolrResponse {
-  
+ 
+  private static final long serialVersionUID = 4721653044098960880L;
 
 Review comment:
   My understanding is that this number can actually vary depending on the 
compiler, so setting it to a specific value like this (expecting it to be the 
number the class had in previous versions) may not work for everyone. 
   Since the changes made in SOLR-14095 just add static methods, maybe the 
better solution is to revert them from OverseerSolrResponse and put them in 
some other util class.
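
For context, a minimal sketch of how the two cases behave. When a Serializable class declares no `serialVersionUID`, the JVM derives one by hashing the class's shape (name, interfaces, fields, non-private methods including static ones), which is why adding methods changes it and why compiler-generated members can make it vary. The classes `Derived` and `Pinned` below are hypothetical and are not the real OverseerSolrResponse; only the pinned value is taken from the diff above.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidDemo {

    /** No explicit field: the UID is derived from the class's shape at runtime. */
    public static class Derived implements Serializable {
        int value;
    }

    /** Explicit UID, using the value from the pull request under review. */
    public static class Pinned implements Serializable {
        private static final long serialVersionUID = 4721653044098960880L;
        int value;

        // Adding or removing helpers like this no longer affects the UID.
        public static int helper() {
            return 42;
        }
    }

    /** Returns the UID the serialization runtime will use for the given class. */
    public static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Derived (computed): " + uidOf(Derived.class));
        System.out.println("Pinned (declared):  " + uidOf(Pinned.class));
    }
}
```

Running `uidOf` against the class as built from the previous release is one way to check whether a pinned value really matches what older nodes expect.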


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (LUCENE-9181) gradlew(.bat) should pass --parallel flag

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024494#comment-17024494
 ] 

Dawid Weiss commented on LUCENE-9181:
-

These wrappers are sometimes regenerated so such tweaks would have to be 
reapplied but sounds ok to me! An alternative is to stop the build on the first 
run and require a re-run... Seems lame though.

> gradlew(.bat) should pass --parallel flag
> -
>
> Key: LUCENE-9181
> URL: https://issues.apache.org/jira/browse/LUCENE-9181
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> Followup to LUCENE-9179. 
> For example I have 2 real cores (4 apparent cpus, hyperthreads). 
> With LUCENE-9179 change the build will work the first time, but it will only 
> use one builder and take an eternity.
> Instead if these wrappers passed --parallel then the first build would use 4 
> builders (built in gradle default).
> Subsequent builds for me would only use 2: we still pass --parallel, but now 
> our gradle.properties tells it to use only 2.
> This would give a better first experience (fans spin a bit higher for that 
> first build, but better than slow as hell?)




[jira] [Commented] (LUCENE-9178) Use run-length encoding when writing docIds in BKD tree

2020-01-27 Thread Adrien Grand (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024490#comment-17024490
 ] 

Adrien Grand commented on LUCENE-9178:
--

This is an interesting idea. In the multi-valued case we don't have any 
requirements for the order of points in leaves, so we could even sort leaves by 
doc ID in order to make this more likely to kick in. This would break the other 
storage optimization we have that does run-length encoding on the leading byte 
of the dimension that has the shortest shared prefix, but maybe there would be 
greater savings on doc IDs by doing delta plus run-length encoding.
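
A rough sketch of what "delta plus run-length encoding" on sorted doc IDs could look like. This is an illustration under stated assumptions, not the BKD writer's code: `DocIdRunLengthSketch`, `encode` and `decode` are hypothetical names, and a real implementation would have to reorder the leaf's point values together with the doc IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DocIdRunLengthSketch {

    /** Sorts a copy of the leaf's doc IDs and encodes it as {delta, runLength} pairs. */
    public static List<int[]> encode(int[] docIds) {
        int[] sorted = docIds.clone();
        Arrays.sort(sorted);
        List<int[]> runs = new ArrayList<>();
        int previous = 0;
        int i = 0;
        while (i < sorted.length) {
            int doc = sorted[i];
            int runLength = 1;
            // Count how many consecutive entries share this doc ID.
            while (i + runLength < sorted.length && sorted[i + runLength] == doc) {
                runLength++;
            }
            // Store the gap to the previous distinct doc ID, plus the repeat count.
            runs.add(new int[] {doc - previous, runLength});
            previous = doc;
            i += runLength;
        }
        return runs;
    }

    /** Rebuilds the sorted doc ID array from {delta, runLength} pairs. */
    public static int[] decode(List<int[]> runs) {
        int total = 0;
        for (int[] run : runs) {
            total += run[1];
        }
        int[] docIds = new int[total];
        int doc = 0;
        int pos = 0;
        for (int[] run : runs) {
            doc += run[0];
            Arrays.fill(docIds, pos, pos + run[1], doc);
            pos += run[1];
        }
        return docIds;
    }

    public static void main(String[] args) {
        // A multi-valued leaf: doc 3 has three points, doc 7 has two, doc 9 has one.
        for (int[] run : encode(new int[] {7, 3, 3, 9, 3, 7})) {
            System.out.println("delta=" + run[0] + " runLength=" + run[1]);
        }
    }
}
```

For the leaf in `main`, six doc IDs collapse into three pairs: (3, 3), (4, 2) and (2, 1). The saving grows with the number of points per document, which is the multi-valued case discussed here.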

> Use run-length encoding when writing docIds in BKD tree
> ---
>
> Key: LUCENE-9178
> URL: https://issues.apache.org/jira/browse/LUCENE-9178
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ignacio Vera
>Priority: Major
>
> I think we can easily check if it makes sense to write docIds using run-length 
> compression in the BKD tree. This can probably save some space in the case of 
> multi-valued documents, e.g. LatLonShape and XYShape.




[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024488#comment-17024488
 ] 

Robert Muir commented on LUCENE-9179:
-

I opened LUCENE-9181 for a simple thing we could do to improve that first 
experience.

> gradle setupLocalDefaultsOnce can screw up on the first run
> ---
>
> Key: LUCENE-9179
> URL: https://issues.apache.org/jira/browse/LUCENE-9179
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Dawid Weiss
>Priority: Major
>
> To reproduce:
> {noformat}
> rm gradle.properties
> ./gradlew -p lucene test
> {noformat}
> It will fail with a strange error:
> {noformat}
> > Included build in /home/rmuir/workspace/lucene-solr/lucene has name 
> > 'lucene' which is the same as a project of the main build.
> {noformat}
> It makes me wonder if we should try to do this recursive build stuff at all 
> on the first time, or do it a different way (e.g. alternatives are to fail 
> build, or maybe simply invoke ./gradlew ourselves so that it also picks up 
> parallelism changes)? 




[jira] [Commented] (LUCENE-9175) gradle build leaks tons of gradle-worker-classpath* files in tmpdir

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024489#comment-17024489
 ] 

Dawid Weiss commented on LUCENE-9175:
-

I think it's a bug in gradle. These files are never cleaned up and temp file 
provider doesn't really clean them up either.
https://github.com/gradle/gradle/issues/12020

> gradle build leaks tons of gradle-worker-classpath* files in tmpdir
> ---
>
> Key: LUCENE-9175
> URL: https://issues.apache.org/jira/browse/LUCENE-9175
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> This may be a sign of classloader issues or similar that cause other issues 
> like LUCENE-9174?
> {noformat}
> $ ls /tmp/gradle-worker-classpath* | wc -l
> 523
> {noformat}




[jira] [Created] (LUCENE-9181) gradlew(.bat) should pass --parallel flag

2020-01-27 Thread Robert Muir (Jira)

Robert Muir created LUCENE-9181:
---

 Summary: gradlew(.bat) should pass --parallel flag
 Key: LUCENE-9181
 URL: https://issues.apache.org/jira/browse/LUCENE-9181
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir


Followup to LUCENE-9179. 

For example I have 2 real cores (4 apparent cpus, hyperthreads). 
With LUCENE-9179 change the build will work the first time, but it will only 
use one builder and take an eternity.

Instead if these wrappers passed --parallel then the first build would use 4 
builders (built in gradle default).

Subsequent builds for me would only use 2: we still pass --parallel, but now 
our gradle.properties tells it to use only 2.

This would give a better first experience (fans spin a bit higher for that 
first build, but better than slow as hell?)




[GitHub] [lucene-solr] dnhatn opened a new pull request #1215: LUCENE-9164: Ignore ACE on tragic event if IW is closed

2020-01-27 Thread GitBox

dnhatn opened a new pull request #1215: LUCENE-9164: Ignore ACE on tragic event 
if IW is closed
URL: https://github.com/apache/lucene-solr/pull/1215
 
 
   If an IndexWriter was closed, then AlreadyClosedException should not be 
considered a tragic event.



[jira] [Resolved] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run

2020-01-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9179.
-
Resolution: Fixed

Thanks, didn't know about the issue.

> gradle setupLocalDefaultsOnce can screw up on the first run
> ---
>
> Key: LUCENE-9179
> URL: https://issues.apache.org/jira/browse/LUCENE-9179
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Dawid Weiss
>Priority: Major
>
> To reproduce:
> {noformat}
> rm gradle.properties
> ./gradlew -p lucene test
> {noformat}
> It will fail with a strange error:
> {noformat}
> > Included build in /home/rmuir/workspace/lucene-solr/lucene has name 
> > 'lucene' which is the same as a project of the main build.
> {noformat}
> It makes me wonder if we should try to do this recursive build stuff at all 
> on the first time, or do it a different way (e.g. alternatives are to fail 
> build, or maybe simply invoke ./gradlew ourselves so that it also picks up 
> parallelism changes)? 




[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024479#comment-17024479
 ] 

ASF subversion and git services commented on LUCENE-9179:
-

Commit b420ef8f77209690dcd47e45700a952409ccac62 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b420ef8 ]

LUCENE-9179: don't invoke the same build recursively upon first run, just 
continue. Seems like gradle bug but let's not cry about it - it just happens 
once and CI defaults can be passed independently on command-line.


> gradle setupLocalDefaultsOnce can screw up on the first run
> ---
>
> Key: LUCENE-9179
> URL: https://issues.apache.org/jira/browse/LUCENE-9179
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Dawid Weiss
>Priority: Major
>
> To reproduce:
> {noformat}
> rm gradle.properties
> ./gradlew -p lucene test
> {noformat}
> It will fail with a strange error:
> {noformat}
> > Included build in /home/rmuir/workspace/lucene-solr/lucene has name 
> > 'lucene' which is the same as a project of the main build.
> {noformat}
> It makes me wonder if we should try to do this recursive build stuff at all 
> on the first time, or do it a different way (e.g. alternatives are to fail 
> build, or maybe simply invoke ./gradlew ourselves so that it also picks up 
> parallelism changes)? 




[jira] [Commented] (LUCENE-9180) newlines/gitattributes cleanup

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024477#comment-17024477
 ] 

Robert Muir commented on LUCENE-9180:
-

I fixed the inconsistent newlines. 

> newlines/gitattributes cleanup
> --
>
> Key: LUCENE-9180
> URL: https://issues.apache.org/jira/browse/LUCENE-9180
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
>
> merge the two .gitattributes files into a single one at the root, fix some 
> random files with DOS newlines that don't need them.




[jira] [Commented] (LUCENE-9180) newlines/gitattributes cleanup

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024476#comment-17024476
 ] 

ASF subversion and git services commented on LUCENE-9180:
-

Commit d614bb854d2b2892969c9b1f9de5f12f88f7181f in lucene-solr's branch 
refs/heads/branch_8x from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d614bb8 ]

LUCENE-9180: dos2unix files that don't need dos line endings. gitignore 
gradle-specific stuff that shows up modified if you switch branches, no gradle 
here.


> newlines/gitattributes cleanup
> --
>
> Key: LUCENE-9180
> URL: https://issues.apache.org/jira/browse/LUCENE-9180
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
>
> merge the two .gitattributes files into a single one at the root, fix some 
> random files with DOS newlines that don't need them.




[jira] [Commented] (LUCENE-9180) newlines/gitattributes cleanup

2020-01-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024473#comment-17024473
 ] 

ASF subversion and git services commented on LUCENE-9180:
-

Commit 8e357b167bf742aacff39ddfff934a958b0a590d in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8e357b1 ]

LUCENE-9180: dos2unix files that don't need dos line endings


> newlines/gitattributes cleanup
> --
>
> Key: LUCENE-9180
> URL: https://issues.apache.org/jira/browse/LUCENE-9180
> Project: Lucene - Core
>  Issue Type: Task
>  Components: general/build
>Reporter: Robert Muir
>Priority: Major
>
> merge the two .gitattributes files into a single one at the root, fix some 
> random files with DOS newlines that don't need them.




[jira] [Resolved] (LUCENE-9020) Find a way to publish Solr RefGuide and Javadocs without checking into git

2020-01-27 Thread Uwe Schindler (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-9020.
---
Resolution: Fixed

> Find a way to publish Solr RefGuide and Javadocs without checking into git
> --
>
> Key: LUCENE-9020
> URL: https://issues.apache.org/jira/browse/LUCENE-9020
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Currently we check in all versions of RefGuide (hundreds of small html files) 
> into svn to publish as part of the site. With new site we should find a 
> smoother way to do this.




[jira] [Commented] (LUCENE-9020) Find a way to publish Solr RefGuide and Javadocs without checking into git

2020-01-27 Thread Uwe Schindler (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024469#comment-17024469
 ] 

Uwe Schindler commented on LUCENE-9020:
---

I was able to set everything up. It was a bit more complicated, as AliasMatch 
and absolute path names are not allowed in .htaccess files (which are a 
per-directory config, so the actual directory is already resolved). See 
INFRA-19439 for more details.

We solved the issue mostly over the Slack channel. In short, here is what 
Daniel and I did:

"Alias" and "AliasMatch" do not work in ".htaccess" (which is a per-directory 
config, so the file system path has already been resolved and it's too late to 
apply aliases. Aliases only work in the server config or a location). The 
workaround is to use "mod-rewrite". The fix consists of 2 separate parts:

- INFRA added an Alias/Rewrite on their side in the global server config that 
can be used by all project servers. It maps URI path "/__root" to the 
filesystem path where all project webpages are hosted. See this initial commit: 
https://github.com/apache/infrastructure-p6/compare/a63511b7499f...63e23e52b18b
- Lucene/Solr changed their .htaccess to use rewrite directives that just 
rewrite the above URLs to "/__root/old-svn-website.../". This makes it 
independent from real filesystem paths. We (Lucene) just know that below the 
URI path "/__root" we can reach all project folders that are checked out on the 
web server. The only downside: you can theoretically reach every website by 
hand-crafting a URL like 
https://lucene.apache.org/__root/someotherproject/somehtml. With nginx as 
webserver you could define this URI path as "internal", but Apache HTTPD does 
not have this notion. Nginx uses this "internal" notion for resource endpoints 
only accessible by rewrites. Our htaccess now looks like this: 
https://github.com/apache/lucene-site/blob/3fa9933b276897f89525c61301d9e4e2da863b85/content/.htaccess#L121-L125

We did some tests by checking out part of the SVN tree on the staging 
machine. But the whole rewrite generally only works in production (which is no 
different from our old website, as the old CMS was also not showing our 
javadocs).

The final step is to bring the website to production. We may need some more 
help once this is done, as we cannot guarantee that everything works perfectly 
in production (we hope so).

> Find a way to publish Solr RefGuide and Javadocs without checking into git
> --
>
> Key: LUCENE-9020
> URL: https://issues.apache.org/jira/browse/LUCENE-9020
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Assignee: Uwe Schindler
>Priority: Major
>
> Currently we check in all versions of RefGuide (hundreds of small html files) 
> into svn to publish as part of the site. With new site we should find a 
> smoother way to do this.




[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed

2020-01-27 Thread Atri Sharma (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024461#comment-17024461
 ] 

Atri Sharma commented on LUCENE-9164:
-

I am of two minds on this -- isn't trying to close an already closed 
IndexWriter a sign of a potentially fatal bug in the user code? This changes 
user-facing behaviour (unless I am reading the patch wrong), so I would want to 
understand the reasoning for this change.

> Should not consider ACE a tragedy if IW is closed
> -
>
> Key: LUCENE-9164
> URL: https://issues.apache.org/jira/browse/LUCENE-9164
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: master (9.0), 8.5, 8.4.2
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
> Attachments: LUCENE-9164.patch, LUCENE-9164.patch
>
>
> If IndexWriter is closed or being closed, AlreadyClosedException is expected. 
> We should not consider it a tragic event in this case.




[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024460#comment-17024460
 ] 

Dawid Weiss commented on LUCENE-9179:
-

I'll fix it to not run recursively; it'll just generate the defaults and 
continue with the build. It may be slower on the first run but it'll still 
print a message about it.

Looks like a bug in gradle task to me. 

> gradle setupLocalDefaultsOnce can screw up on the first run
> ---
>
> Key: LUCENE-9179
> URL: https://issues.apache.org/jira/browse/LUCENE-9179
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Dawid Weiss
>Priority: Major
>
> To reproduce:
> {noformat}
> rm gradle.properties
> ./gradlew -p lucene test
> {noformat}
> It will fail with a strange error:
> {noformat}
> > Included build in /home/rmuir/workspace/lucene-solr/lucene has name 
> > 'lucene' which is the same as a project of the main build.
> {noformat}
> It makes me wonder if we should try to do this recursive build stuff at all 
> on the first time, or do it a different way (e.g. alternatives are to fail 
> build, or maybe simply invoke ./gradlew ourselves so that it also picks up 
> parallelism changes)? 




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024459#comment-17024459
 ] 

Robert Muir commented on LUCENE-9166:
-

The only use-case IMO is individual test debugging. You got a 
SecurityException, and for some reason it's unclear why, so you have to dig a 
bit deeper.

I haven't dug into the low-level issues with gradle here, but it's similar to 
other narrow use-cases such as wanting to use PrintCompilation or other such 
stuff in tests and get all the output without something trying to interpret it.

> gradle build: test failures need stacktraces
> 
>
> Key: LUCENE-9166
> URL: https://issues.apache.org/jira/browse/LUCENE-9166
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9166.patch
>
>
> Test failures are missing the stacktrace. Worse yet, it tells you to go look 
> at a separate (very long) filename which also has no stacktrace :(
> I know gradle tries really hard to be quiet and not say anything, but when a 
> test fails, that isn't the time or place :)




[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024450#comment-17024450
 ] 

Dawid Weiss commented on LUCENE-9179:
-

We don't need a recursive build at all -- it will just generate defaults and 
continue to run. I only wanted to run recursively because I hoped the new 
machine-specific defaults would be picked up (I don't think they are, not in 
full).

> gradle setupLocalDefaultsOnce can screw up on the first run
> ---
>
> Key: LUCENE-9179
> URL: https://issues.apache.org/jira/browse/LUCENE-9179
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> To reproduce:
> {noformat}
> rm gradle.properties
> ./gradlew -p lucene test
> {noformat}
> It will fail with a strange error:
> {noformat}
> > Included build in /home/rmuir/workspace/lucene-solr/lucene has name 
> > 'lucene' which is the same as a project of the main build.
> {noformat}
> It makes me wonder if we should try to do this recursive build stuff at all 
> on the first time, or do it a different way (e.g. alternatives are to fail 
> build, or maybe simply invoke ./gradlew ourselves so that it also picks up 
> parallelism changes)? 




[jira] [Commented] (LUCENE-9174) Bump default gradle memory to 2g

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024448#comment-17024448
 ] 

Dawid Weiss commented on LUCENE-9174:
-

That's right - that's what I had in mind. The daemon runs out of heap. The 
situation depends on what you're running; javadocs are computed within daemon's 
JVM I think; maybe with too many parallel threads it just explodes. I haven't 
looked at the problem closely yet.

> Bump default gradle memory to 2g
> 
>
> Key: LUCENE-9174
> URL: https://issues.apache.org/jira/browse/LUCENE-9174
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Major
>
> I see these from time to time so I'll bump the daemon's heap to 2 gigs. Don't 
> know why it needs so much...
> {code}
> Expiring Daemon because JVM heap space is exhausted
> Daemon will be stopped at the end of the build after running out of JVM memory
> {code}




[jira] [Created] (LUCENE-9180) newlines/gitattributes cleanup

2020-01-27 Thread Robert Muir (Jira)

Robert Muir created LUCENE-9180:
---

 Summary: newlines/gitattributes cleanup
 Key: LUCENE-9180
 URL: https://issues.apache.org/jira/browse/LUCENE-9180
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir


Merge the two .gitattributes files into a single one at the root, and fix some 
random files with DOS newlines that don't need them.




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024442#comment-17024442
 ] 

Dawid Weiss commented on LUCENE-9166:
-

bq. I can't remember how it worked, but I feel like it was still using your 
junit runner to actually run the tests versus the built-in gradle test support?

Correct. They switched recently. There are benefits to using gradle's runner 
(better integration with the rest of the infrastructure is one of them). I 
don't know how to solve it properly yet. Security and other JVM-level messaging 
is a very narrow area and not commonly used. Maybe we just need a dumb 
substitute for running these (they're typically individual tests anyway).

> gradle build: test failures need stacktraces
> 
>
> Key: LUCENE-9166
> URL: https://issues.apache.org/jira/browse/LUCENE-9166
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9166.patch
>
>
> Test failures are missing the stacktrace. Worse yet, it tells you to go look 
> at a separate (very long) filename which also has no stacktrace :(
> I know gradle tries really hard to be quiet and not say anything, but when a 
> test fails, that isn't the time or place :)




[jira] [Commented] (LUCENE-9175) gradle build leaks tons of gradle-worker-classpath* files in tmpdir

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024438#comment-17024438
 ] 

Robert Muir commented on LUCENE-9175:
-

Gradle creates the file here: 
https://github.com/gradle/gradle/blob/b7f79aa9b29cd6ad7fb9f189dceb0311ef7b6bfd/subprojects/core/src/main/java/org/gradle/process/internal/worker/child/ApplicationClassesInSystemClassLoaderWorkerImplementationFactory.java#L96

> gradle build leaks tons of gradle-worker-classpath* files in tmpdir
> ---
>
> Key: LUCENE-9175
> URL: https://issues.apache.org/jira/browse/LUCENE-9175
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> This may be a sign of classloader issues or similar that cause other issues 
> like LUCENE-9174?
> {noformat}
> $ ls /tmp/gradle-worker-classpath* | wc -l
> 523
> {noformat}




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024435#comment-17024435
 ] 

Robert Muir commented on LUCENE-9166:
-

Interesting. I feel like this worked for the elasticsearch gradle build a long 
time ago. I can't remember how it worked, but I feel like it was still using 
your junit runner to actually run the tests versus the built-in gradle test 
support? Maybe it would solve several of our problems?

> gradle build: test failures need stacktraces
> 
>
> Key: LUCENE-9166
> URL: https://issues.apache.org/jira/browse/LUCENE-9166
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9166.patch
>
>
> Test failures are missing the stacktrace. Worse yet, it tells you to go look 
> at a separate (very long) filename which also has no stacktrace :(
> I know gradle tries really hard to be quiet and not say anything, but when a 
> test fails, that isn't the time or place :)




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024429#comment-17024429
 ] 

Dawid Weiss commented on LUCENE-9166:
-

Correct. It is dumb. The API is there to provide a different ordering but it's 
all internal so I don't know if it makes sense to waste cycles right now to try 
to fix it. What worries me more is LUCENE-9120: this is something we will 
probably need sooner or later. I don't think there is an easy workaround inside 
gradle itself. It's more likely we'll have to redirect to ant or implement a 
custom java launcher for such corner-cases (which isn't a big deal but requires 
some coding).

> gradle build: test failures need stacktraces
> 
>
> Key: LUCENE-9166
> URL: https://issues.apache.org/jira/browse/LUCENE-9166
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9166.patch
>
>
> Test failures are missing the stacktrace. Worse yet, it tells you to go look 
> at a separate (very long) filename which also has no stacktrace :(
> I know gradle tries really hard to be quiet and not say anything, but when a 
> test fails, that isn't the time or place :)




[GitHub] [lucene-solr] gerlowskija commented on issue #1171: SOLR-13892: Add 'top-level' docValues Join implementation

2020-01-27 Thread GitBox

gerlowskija commented on issue #1171: SOLR-13892: Add 'top-level' docValues 
Join implementation
URL: https://github.com/apache/lucene-solr/pull/1171#issuecomment-578801824
 
 
   > In JoinQuery.rewrite I see it says explicitly "don't rewrite the subQuery" 
but why not? If it never gets rewritten then that's a bug.
   
   That's a question for the original Join author maybe?  I'm not familiar 
enough with how rewrites work to speak to it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [lucene-solr] gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation

2020-01-27 Thread GitBox

gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 
'top-level' docValues Join implementation
URL: https://github.com/apache/lucene-solr/pull/1171#discussion_r371286866
 
 

 ##
 File path: solr/core/src/java/org/apache/solr/search/JoinQParserPlugin.java
 ##
 @@ -59,67 +60,124 @@
 import org.apache.solr.search.join.ScoreJoinQParserPlugin;
 import org.apache.solr.util.RTimer;
 import org.apache.solr.util.RefCounted;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 public class JoinQParserPlugin extends QParserPlugin {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
   public static final String NAME = "join";
+  /** Choose the internal algorithm */
+  private static final String METHOD = "method";
+
+  private static class JoinParams {
+    final String fromField;
+    final String fromCore;
+    final Query fromQuery;
+    final long fromCoreOpenTime;
+    final String toField;
+
+    public JoinParams(String fromField, String fromCore, Query fromQuery, long fromCoreOpenTime, String toField) {
+      this.fromField = fromField;
+      this.fromCore = fromCore;
+      this.fromQuery = fromQuery;
+      this.fromCoreOpenTime = fromCoreOpenTime;
+      this.toField = toField;
+    }
+  }
+
+  private enum Method {
+    index {
+      @Override
+      Query makeFilter(QParser qparser) throws SyntaxError {
+        final JoinParams jParams = parseJoin(qparser);
+        final JoinQuery q = new JoinQuery(jParams.fromField, jParams.toField, jParams.fromCore, jParams.fromQuery);
+        q.fromCoreOpenTime = jParams.fromCoreOpenTime;
+        return q;
+      }
+    },
+    dvWithScore {
+      @Override
+      Query makeFilter(QParser qparser) throws SyntaxError {
+        return new ScoreJoinQParserPlugin().createParser(qparser.qstr, qparser.localParams, qparser.params, qparser.req).parse();
+      }
+    },
+    topLevelDV {
+      @Override
+      Query makeFilter(QParser qparser) throws SyntaxError {
+        final JoinParams jParams = parseJoin(qparser);
+        final JoinQuery q = new TopLevelJoinQuery(jParams.fromField, jParams.toField, jParams.fromCore, jParams.fromQuery);
+        q.fromCoreOpenTime = jParams.fromCoreOpenTime;
+        return q;
+      }
+    };
+
+    abstract Query makeFilter(QParser qparser) throws SyntaxError;
+
+    JoinParams parseJoin(QParser qparser) throws SyntaxError {
+      final String fromField = qparser.getParam("from");
+      final String fromIndex = qparser.getParam("fromIndex");
+      final String toField = qparser.getParam("to");
+      final String v = qparser.localParams.get("v");
+      final String coreName;
+
+      Query fromQuery;
+      long fromCoreOpenTime = 0;
+
+      if (fromIndex != null && !fromIndex.equals(qparser.req.getCore().getCoreDescriptor().getName()) ) {
+        CoreContainer container = qparser.req.getCore().getCoreContainer();
+
+        // if in SolrCloud mode, fromIndex should be the name of a single-sharded collection
+        coreName = ScoreJoinQParserPlugin.getCoreName(fromIndex, container);
+
+        final SolrCore fromCore = container.getCore(coreName);
+        if (fromCore == null) {
+          throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
+              "Cross-core join: no such core " + coreName);
+        }
+
+        RefCounted fromHolder = null;
+        LocalSolrQueryRequest otherReq = new LocalSolrQueryRequest(fromCore, qparser.params);
+        try {
 
 Review comment:
   Totally agree - should be using try-with-resources here.
   
   But I'm reluctant to introduce changes here that aren't strictly necessary.  
(Github shows this section as "added", but really it was just moved from 
elsewhere in the file.)
   
   My opinion on this changes, but I've been burned too many times recently by 
adding a "harmless" refactor into a related commit, only for that to cause 
issues later that force a revert.
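
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of try-with-resources. It is not the code from the pull request: `FakeRequest` is a hypothetical stand-in for a closeable request object, and `run` is an illustrative method name. The sketch only shows that the resource is closed automatically, and that it is closed before a catch clause on the same statement runs.

```java
import java.util.ArrayList;
import java.util.List;

class TryWithResourcesSketch {

  /** Hypothetical stand-in for a closeable request; records what happens to it. */
  static class FakeRequest implements AutoCloseable {
    private final List<String> events;

    FakeRequest(List<String> events) {
      this.events = events;
      events.add("open");
    }

    @Override
    public void close() {
      events.add("close");
    }
  }

  /** Runs the guarded block and returns the ordered list of events. */
  static List<String> run(boolean fail) {
    List<String> events = new ArrayList<>();
    // The resource declared in the parentheses is closed when the block exits,
    // whether it exits normally or by throwing.
    try (FakeRequest otherReq = new FakeRequest(events)) {
      if (fail) {
        throw new IllegalStateException("simulated failure");
      }
      events.add("use");
    } catch (IllegalStateException e) {
      // By the time this clause runs, close() has already been called.
      events.add("caught");
    }
    return events;
  }
}
```

Compared with a manual try/finally, there is no explicit close call to forget, which is the usual argument for making such a change in a dedicated follow-up commit.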



[GitHub] [lucene-solr] gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 'top-level' docValues Join implementation

2020-01-27 Thread GitBox

gerlowskija commented on a change in pull request #1171: SOLR-13892: Add 
'top-level' docValues Join implementation
URL: https://github.com/apache/lucene-solr/pull/1171#discussion_r371296171
 
 

 ##
 File path: 
solr/core/src/java/org/apache/solr/search/join/MultiValueTermOrdinalCollector.java
 ##
 @@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.search.join;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+
+import org.apache.lucene.index.LeafReaderContext;
+import org.apache.lucene.index.SortedSetDocValues;
+import org.apache.lucene.search.ScoreMode;
+import org.apache.lucene.util.LongBitSet;
+import org.apache.solr.search.DelegatingCollector;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Populates a bitset of (top-level) ordinals based on field values in a multi-valued field.
+ */
+public class MultiValueTermOrdinalCollector extends DelegatingCollector {
 
 Review comment:
   Because it saves me from reimplementing `getLeafCollector()` and some other 
methods.  If you're wondering why DelegatingCollector as opposed to 
SimpleCollector or other options, there's not a great answer - 
DelegatingCollector was needed in some earlier revision when things were 
postfilter based.  I've changed it to use SimpleCollector; hopefully that 
addresses your concern.



[jira] [Commented] (SOLR-14193) Update tutorial.adoc(line no:664) so that command executes in windows enviroment

2020-01-27 Thread Cassandra Targett (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024396#comment-17024396
 ] 

Cassandra Targett commented on SOLR-14193:
--

OK, that makes sense. Looking forward to your new PR & thanks for your help!

> Update tutorial.adoc(line no:664) so that command executes in windows 
> enviroment
> 
>
> Key: SOLR-14193
> URL: https://issues.apache.org/jira/browse/SOLR-14193
> Project: Solr
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 8.4
>Reporter: balaji sundaram
>Priority: Minor
>
>  
> When executing the following command in Windows 10, "java -jar -Dc=films 
> -Dparams=f.genre.split=true&f.directed_by.split=true&f.genre.separator=|&f.directed_by.separator=|
>  -Dauto example\exampledocs\post.jar example\films\*.csv", it throws the error 
> "& was unexpected at this time."
> Fix: the command should escape the "&" and "|" symbols.




[jira] [Commented] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024395#comment-17024395
 ] 

Robert Muir commented on LUCENE-9179:
-

I know [~mikemccand] hit this trying to use gradle in luceneutils (with a clean 
checkout like a CI tool might do). As a workaround I suggested he run 
{{./gradlew help}} first so that it generates the properties file, then run 
again.


> gradle setupLocalDefaultsOnce can screw up on the first run
> ---
>
> Key: LUCENE-9179
> URL: https://issues.apache.org/jira/browse/LUCENE-9179
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Robert Muir
>Priority: Major
>
> To reproduce:
> {noformat}
> rm gradle.properties
> ./gradlew -p lucene test
> {noformat}
> It will fail with a strange error:
> {noformat}
> > Included build in /home/rmuir/workspace/lucene-solr/lucene has name 
> > 'lucene' which is the same as a project of the main build.
> {noformat}
> It makes me wonder if we should try to do this recursive build stuff at all 
> on the first time, or do it a different way (e.g. alternatives are to fail 
> build, or maybe simply invoke ./gradlew ourselves so that it also picks up 
> parallelism changes)? 




[jira] [Created] (LUCENE-9179) gradle setupLocalDefaultsOnce can screw up on the first run

2020-01-27 Thread Robert Muir (Jira)

Robert Muir created LUCENE-9179:
---

 Summary: gradle setupLocalDefaultsOnce can screw up on the first 
run
 Key: LUCENE-9179
 URL: https://issues.apache.org/jira/browse/LUCENE-9179
 Project: Lucene - Core
  Issue Type: Task
Reporter: Robert Muir


To reproduce:

{noformat}
rm gradle.properties
./gradlew -p lucene test
{noformat}

It will fail with a strange error:
{noformat}
> Included build in /home/rmuir/workspace/lucene-solr/lucene has name 'lucene' 
> which is the same as a project of the main build.
{noformat}

It makes me wonder if we should try to do this recursive build stuff at all on 
the first time, or do it a different way (e.g. alternatives are to fail build, 
or maybe simply invoke ./gradlew ourselves so that it also picks up parallelism 
changes)? 




[jira] [Resolved] (SOLR-14220) Unable to build 7_7 or 8_4 due to missing dependency

2020-01-27 Thread Cassandra Targett (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett resolved SOLR-14220.
--
Resolution: Duplicate

This appears to be a duplicate of LUCENE-9170; closing this in favor of that 
one.

> Unable to build 7_7 or 8_4 due to missing dependency
> 
>
> Key: SOLR-14220
> URL: https://issues.apache.org/jira/browse/SOLR-14220
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Build
>Affects Versions: 7.7, 8.4
>Reporter: Karl Stoney
>Priority: Major
>  Labels: build, build-failure
>
> Attempting to build from:
> 7_7:
> https://github.com/apache/lucene-solr/commit/7a309c21ebbc1b08d9edf67802b63fc0bc7affcf
> or
> 8_4:
> https://github.com/apache/lucene-solr/commit/7d3ac7c284b26ce62f41d3b8686f70c7d6bd758d
> Results in the same build failure:
> {code:java}
> BUILD FAILED
> /usr/local/autotrader/app/lucene-solr/solr/build.xml:685: The following error 
> occurred while executing this line:
> /usr/local/autotrader/app/lucene-solr/solr/build.xml:656: The following error 
> occurred while executing this line:
> /usr/local/autotrader/app/lucene-solr/lucene/common-build.xml:653: Error 
> downloading wagon provider from the remote repository: Missing:
> --
> 1) org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-7
>   Try downloading the file manually from the project website.
>   Then, install it using the command: 
>   mvn install:install-file -DgroupId=org.apache.maven.wagon 
> -DartifactId=wagon-ssh -Dversion=1.0-beta-7 -Dpackaging=jar 
> -Dfile=/path/to/file
>   Alternatively, if you host your own repository you can deploy the file 
> there: 
>   mvn deploy:deploy-file -DgroupId=org.apache.maven.wagon 
> -DartifactId=wagon-ssh -Dversion=1.0-beta-7 -Dpackaging=jar 
> -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
>   Path to dependency: 
>   1) unspecified:unspecified:jar:0.0
>   2) org.apache.maven.wagon:wagon-ssh:jar:1.0-beta-7
> --
> 1 required artifact is missing.
> for artifact: 
>   unspecified:unspecified:jar:0.0
> from the specified remote repositories:
>   central (http://repo1.maven.org/maven2)
> {code}
> Previously building 7_7 from 3aad3311a97256a8537dd04165c67edcce1c153c, and 
> 8_4 from c0b96fd305946b2564b967272e6e23c59ab0b5da worked fine.




[jira] [Updated] (SOLR-10665) POC for a PF4J based plugin system

2020-01-27 Thread David Smiley (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-10665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-10665:

Description: 
In SOLR-5103 we have been discussing improvements to Solr plugin system, with 
ability to bundle a plugin as zip, and easily install from shell or Admin UI.

This task aims to create a working POC to demonstrate how PF4J (Plugin 
Framework4J) can be used to bring a very simple plugin packaging and 
installation system to Solr with a minimum of effort. Code speaks louder than 
words :)

The POC effort is a quite large patch and will be cutting some corners to get 
the feature in the hands of people who can test and evaluate. If there is 
consensus to add this to Solr, there will be other sub tasks to split up the 
elephant into committable chunks.

The design document is located here: [https://s.apache.org/solr-plugin] (Google 
Doc) - comments are welcome in the document or here.

  was:
In SOLR-5103 we have been discussing improvements to Solr plugin system, with 
ability to bundle a plugin as zip, and easily install from shell or Admin UI.

This task aims to create a working POC to demonstrate how PF4J (Plugin 
Framework4J) can be used to bring a very simple plugin packaging and 
installation system to Solr with a minimum of effort. Code speaks louder than 
words :)

The POC effort is a quite large patch and will be cutting some corners to get 
the feature in the hands of people who can test and evaluate. If there is 
consensus to add this to Solr, there will be other sub tasks to split up the 
elephant into committable chunks.

The design document is located here: https://s.apache.org/solr-plugin (Google 
Doc) - comments are welcome in the document or here.


> POC for a PF4J based plugin system
> --
>
> Key: SOLR-10665
> URL: https://issues.apache.org/jira/browse/SOLR-10665
> Project: Solr
>  Issue Type: New Feature
>  Components: Plugin system
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
>  Labels: pf4j, plugins
> Attachments: SOLR-10665.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In SOLR-5103 we have been discussing improvements to Solr plugin system, with 
> ability to bundle a plugin as zip, and easily install from shell or Admin UI.
> This task aims to create a working POC to demonstrate how PF4J (Plugin 
> Framework4J) can be used to bring a very simple plugin packaging and 
> installation system to Solr with a minimum of effort. Code speaks louder than 
> words :)
> The POC effort is a quite large patch and will be cutting some corners to get 
> the feature in the hands of people who can test and evaluate. If there is 
> consensus to add this to Solr, there will be other sub tasks to split up the 
> elephant into committable chunks.
> The design document is located here: [https://s.apache.org/solr-plugin] 
> (Google Doc) - comments are welcome in the document or here.




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024377#comment-17024377
 ] 

Robert Muir commented on LUCENE-9166:
-

LOL thanks, I am not digging too far yet: trying to balance time also fixing 
slow tests. It seems the gradle load balancing is quite a bit dumber than the 
junit4-work-stealing we had before. It makes up for it somewhat by 
parallelizing across modules but we have some big fat ones that can bottleneck 
builds. There is probably an easy win here...
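
To illustrate why the assignment strategy matters, here is a toy model. It assumes a static round-robin assignment of test suites to workers on one side and a "next suite goes to the first idle worker" scheme on the other. Both are simplifications for illustration, not the actual gradle or junit4 implementations, and all names in it are hypothetical.

```java
import java.util.PriorityQueue;

class LoadBalanceSketch {

  /** Static assignment: suite i runs on worker i % workers. Returns total wall time. */
  static long roundRobinMakespan(long[] durations, int workers) {
    long[] load = new long[workers];
    for (int i = 0; i < durations.length; i++) {
      load[i % workers] += durations[i];
    }
    long max = 0;
    for (long l : load) {
      max = Math.max(max, l);
    }
    return max;
  }

  /** Dynamic assignment: each suite, in order, goes to the worker that is idle first. */
  static long dynamicMakespan(long[] durations, int workers) {
    // Each queue entry is the time at which one worker becomes free.
    PriorityQueue<Long> freeAt = new PriorityQueue<>();
    for (int w = 0; w < workers; w++) {
      freeAt.add(0L);
    }
    long max = 0;
    for (long d : durations) {
      long end = freeAt.poll() + d;
      freeAt.add(end);
      max = Math.max(max, end);
    }
    return max;
  }
}
```

With suite durations of 10, 1, 10, 1 on two workers, the static scheme puts both slow suites on the same worker (wall time 20), while the dynamic scheme finishes in 11. A few large suites landing on one worker is the bottleneck effect described above.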

> gradle build: test failures need stacktraces
> 
>
> Key: LUCENE-9166
> URL: https://issues.apache.org/jira/browse/LUCENE-9166
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9166.patch
>
>
> Test failures are missing the stacktrace. Worse yet, it tells you to go look 
> at a separate (very long) filename which also has no stacktrace :(
> I know gradle tries really hard to be quiet and not say anything, but when a 
> test fails, that isn't the time or place :)




[jira] [Created] (LUCENE-9178) Use run-length encoding when writing docIds in BKD tree

2020-01-27 Thread Ignacio Vera (Jira)

Ignacio Vera created LUCENE-9178:


 Summary: Use run-length encoding when writing docIds in BKD tree
 Key: LUCENE-9178
 URL: https://issues.apache.org/jira/browse/LUCENE-9178
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ignacio Vera


I think we can easily check whether it makes sense to write docIds using 
run-length compression in the BKD tree. This can probably save some space in the 
case of multi-valued documents, e.g. LatLonShape and XYShape.
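
The proposal can be illustrated with a minimal sketch. It assumes docIds in a leaf arrive as a plain int array in which repeated ids sit next to each other; the class and method names (DocIdRunLength, encode, decode, isSmaller) are hypothetical and are not part of Lucene's BKD writer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Lucene's BKD writer code.
final class DocIdRunLength {

  // Encodes docIds as flat (docId, runLength) pairs: [id0, len0, id1, len1, ...].
  // Only adjacent repeats are merged.
  static int[] encode(int[] docIds) {
    List<Integer> out = new ArrayList<>();
    int i = 0;
    while (i < docIds.length) {
      int j = i;
      while (j < docIds.length && docIds[j] == docIds[i]) {
        j++;
      }
      out.add(docIds[i]); // the repeated docId
      out.add(j - i);     // how many times it repeats
      i = j;
    }
    int[] runs = new int[out.size()];
    for (int k = 0; k < runs.length; k++) {
      runs[k] = out.get(k);
    }
    return runs;
  }

  // Expands (docId, runLength) pairs back into the original docId sequence.
  static int[] decode(int[] runs) {
    int total = 0;
    for (int k = 1; k < runs.length; k += 2) {
      total += runs[k];
    }
    int[] docIds = new int[total];
    int pos = 0;
    for (int k = 0; k < runs.length; k += 2) {
      Arrays.fill(docIds, pos, pos + runs[k + 1], runs[k]);
      pos += runs[k + 1];
    }
    return docIds;
  }

  // The "does it make sense" check: true only if the pairs take fewer ints
  // than the raw list.
  static boolean isSmaller(int[] docIds) {
    return encode(docIds).length < docIds.length;
  }
}
```

With mostly unique docIds the pairs are larger than the raw list, which is why the sketch checks the size before choosing this encoding.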




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024363#comment-17024363
 ] 

Dawid Weiss commented on LUCENE-9166:
-

No worries, I didn't get that impression. :) As for debugging gradle - welcome 
to the club. The more I am involved in those complex gradle builds the more 
schizophrenic I become about them. One moment you're in awe and they're the 
greatest thing, the next you're debugging or digging through source to figure 
out what's wrong. 





[jira] [Commented] (LUCENE-9173) SynonymGraphFilter doesn't correctly consume decompounded tokens (branched token graph)

2020-01-27 Thread Michael McCandless (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024358#comment-17024358
 ] 

Michael McCandless commented on LUCENE-9173:


Yeah this is a known tricky issue for both {{SynonymFilter}} and 
{{SynonymGraphFilter}} (though, maybe the former does not throw an exception if 
you feed it a graph?).

Also, note that the exception above is while building the {{SynonymMap}} and 
not while actually tokenizing.  You could successfully build a {{SynonymMap}} 
but then if you feed a graph to {{SynonymGraphFilter}} I think it detects that 
and throws an exception then.

It is possible to fix this – it's just software! – it's just rather tricky for 
{{SynonymGraphFilter}} to find matches in an incoming graph, and in general 
could become quite costly in adversarial cases of e.g. high numbers of tokens 
at the same position in the input graph.
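
As a rough sketch of what "feeding a graph" means here: a token stream is a plain chain only if every token advances by exactly one position and spans exactly one position. The Tok and GraphCheck classes below are hypothetical stand-ins, not Lucene classes; the check is only analogous to the position-increment test named in the exception message.

```java
import java.util.List;

// Hypothetical stand-in for a token's position attributes; not a Lucene class.
final class Tok {
  final String term;
  final int posInc; // position increment relative to the previous token
  final int posLen; // number of positions the token spans

  Tok(String term, int posInc, int posLen) {
    this.term = term;
    this.posInc = posInc;
    this.posLen = posLen;
  }
}

// Hypothetical helper; illustrates the checks, not SynonymGraphFilter itself.
final class GraphCheck {

  // True when the tokens form a simple chain with no stacked or spanning tokens.
  static boolean isLinear(List<Tok> tokens) {
    for (Tok t : tokens) {
      if (t.posInc != 1 || t.posLen != 1) {
        return false;
      }
    }
    return true;
  }

  // Largest number of tokens that start at the same position. Large values
  // are the adversarial case where graph matching could become costly.
  static int maxTokensAtOnePosition(List<Tok> tokens) {
    int max = 0;
    int current = 0;
    for (Tok t : tokens) {
      current = (t.posInc == 0) ? current + 1 : 1;
      max = Math.max(max, current);
    }
    return max;
  }
}
```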

> SynonymGraphFilter doesn't correctly consume decompounded tokens  (branched 
> token graph)
> 
>
> Key: LUCENE-9173
> URL: https://issues.apache.org/jira/browse/LUCENE-9173
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Reporter: Tomoko Uchida
>Priority: Minor
>
> This is a derived issue from LUCENE-9123.
> When the tokenizer that is given to SynonymGraphFilter decompounds tokens or 
> emits multiple tokens at the same position, SynonymGraphFilter cannot 
> correctly handle them (an exception will be thrown).
> For example, JapaneseTokenizer (mode=SEARCH) would emit a token and two 
> decompounded tokens for the text "株式会社":
> {code:java}
> 株式会社 (positionIncrement=0, positionLength=2)
> 株式 (positionIncrement=1, positionLength=1)
> 会社 (positionIncrement=1, positionLength=1)
> {code}
> Then if we supply the synonym "株式会社,コーポレーション" through SynonymGraphFilterFactory 
> (with tokenizerFactory=JapaneseTokenizerFactory), this exception is thrown:
> {code:java}
> Caused by: java.lang.IllegalArgumentException: term: 株式会社 analyzed to a token (株式会社) with position increment != 1 (got: 0)
>   at org.apache.lucene.analysis.synonym.SynonymMap$Parser.analyze(SynonymMap.java:325) ~[lucene-analyzers-common-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38]
>   at org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:114) ~[lucene-analyzers-common-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38]
>   at org.apache.lucene.analysis.synonym.SolrSynonymParser.parse(SolrSynonymParser.java:70) ~[lucene-analyzers-common-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38]
>   at org.apache.lucene.analysis.synonym.SynonymGraphFilterFactory.loadSynonyms(SynonymGraphFilterFactory.java:179) ~[lucene-analyzers-common-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38]
>   at org.apache.lucene.analysis.synonym.SynonymGraphFilterFactory.inform(SynonymGraphFilterFactory.java:154) ~[lucene-analyzers-common-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:38]
> {code}
> This isn't limited to JapaneseTokenizer; it is a more general issue with 
> handling branched token graphs (decompounded tokens in midstream).




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024351#comment-17024351
 ] 

Robert Muir commented on LUCENE-9166:
-

I didn't mean to give the impression this bug was your fault. It took some 
digging for me to figure out WTF was happening because I didn't see anything 
configured to filter out traces at all. I had to add System.out.println calls to 
figure out that the Set actually had a filter in it by default, put there by 
gradle, and what it was doing, etc...





[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024349#comment-17024349
 ] 

Dawid Weiss commented on LUCENE-9166:
-

Let's leave the stack trace in full. I don't think it harms anyone: the build 
is much less chatty anyway, and when a failure happens the stack trace is 
fairly important. We can always trim it later.

As for upgrading gradle... Seems like anything I touch recently turns out to 
have bugs in it so I'm careful with upgrades if something is working. ;) But 
feel free to try it out - altering gradle/wrapper/gradle-wrapper.properties 
should do the trick.





[jira] [Commented] (LUCENE-8143) Remove SpanBoostQuery

2020-01-27 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024346#comment-17024346
 ] 

David Smiley commented on LUCENE-8143:
--

The fact that SpanBoostQuery only works at the top appears not to be a fault 
of its own; this is a very simple / straightforward Query.  Instead the fault 
/ limitation seems to be in SpanWeight or somewhere around there.  Killing 
SpanBoostQuery is a red herring then, no?

> Remove SpanBoostQuery
> -
>
> Key: LUCENE-8143
> URL: https://issues.apache.org/jira/browse/LUCENE-8143
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I initially added it so that span queries could still be boosted, but this 
> was actually a mistake: boosts are ignored on inner span queries, only the 
> boost of the top-level span query, the one that performs scoring, is not 
> ignored.




[jira] [Commented] (LUCENE-9166) gradle build: test failures need stacktraces

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024344#comment-17024344
 ] 

Robert Muir commented on LUCENE-9166:
-

[~dweiss] I think this is actually working around 
https://github.com/gradle/gradle/issues/11220 which looks like it was fixed in 
6.1.1

Still, I am skeptical of the filtering :) But alternatively we could revert this 
commit and upgrade, and it would solve at least the particular problem that I 
had here. Looks like the fix simply checks for the case where the filter would 
remove the whole stacktrace...
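
The kind of guard described above could look roughly like the following sketch. It is not Gradle's actual code: the TraceFilter class, its filter method, and the use of plain strings for stack frames are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration; not taken from Gradle's sources.
final class TraceFilter {

  // Keeps only the frames accepted by 'keep'. If that would drop every
  // frame, the full, unfiltered trace is returned instead.
  static List<String> filter(List<String> frames, Predicate<String> keep) {
    List<String> kept = new ArrayList<>();
    for (String frame : frames) {
      if (keep.test(frame)) {
        kept.add(frame);
      }
    }
    return kept.isEmpty() ? new ArrayList<>(frames) : kept;
  }
}
```

A guard like this never prints an empty trace, but it can still hide frames a developer wants, which is the remaining concern with filtering.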





[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2020-01-27 Thread Jim Ferenczi (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024336#comment-17024336
 ] 

Jim Ferenczi commented on LUCENE-9177:
--

They use the `kuromoji` tokenizer, so I think there's some value in applying 
NFKC as a char filter?

> ICUNormalizer2CharFilter worst case is very slow
> 
>
> Key: LUCENE-9177
> URL: https://issues.apache.org/jira/browse/LUCENE-9177
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: lucene.patch
>
>
> ICUNormalizer2CharFilter is fast most of the time, but we've had some reports 
> in Elasticsearch that some unrealistic data can slow down the process very 
> significantly. For instance, an input that consists of characters to normalize 
> with no normalization-inert character in between can take up to several 
> seconds to process a few hundred kilobytes on my machine. While the input 
> is not realistic, this worst case can slow down indexing considerably when 
> dealing with uncleaned data.
> I attached a small test that reproduces the slow processing using a stream 
> that contains many repetitions of the character `℃` and no 
> normalization-inert character. I am not surprised that the processing is 
> slower than usual, but several seconds to process seems a lot. Adding 
> normalization-inert characters makes the processing a lot faster, so I 
> wonder if we can improve the process to split the input more eagerly?
>  




[jira] [Commented] (LUCENE-9177) ICUNormalizer2CharFilter worst case is very slow

2020-01-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024333#comment-17024333
 ] 

Robert Muir commented on LUCENE-9177:
-

If they are just doing NFKC, then normalization won't impact most tokenizers 
(standard, icu), so why not just use the tokenfilter instead? It doesn't have 
these issues.

The charfilter should only be needed to try to "cleanup" for tokenizers that 
don't understand unicode, so that they will then tokenize properly.
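
To make concrete what NFKC does to the input, here is a small sketch using the JDK's java.text.Normalizer. This is an assumption made for illustration: the char filter under discussion uses ICU's normalizer, not this class, and NfkcDemo is a hypothetical name.

```java
import java.text.Normalizer;

// Hypothetical demo class; uses the JDK normalizer rather than ICU.
final class NfkcDemo {

  // Applies Unicode NFKC (compatibility decomposition, then composition).
  static String nfkc(String s) {
    return Normalizer.normalize(s, Normalizer.Form.NFKC);
  }

  public static void main(String[] args) {
    // U+2103 DEGREE CELSIUS becomes U+00B0 DEGREE SIGN followed by 'C'.
    System.out.println(nfkc("\u2103").length());
    // U+FF76 HALFWIDTH KATAKANA LETTER KA becomes U+30AB KATAKANA LETTER KA.
    System.out.println(nfkc("\uFF76").equals("\u30AB"));
  }
}
```

Because normalization changes the characters a tokenizer sees (and here even the character count), running it before or after tokenization can give different tokens with some tokenizers.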




