[jira] [Resolved] (LUCENE-9344) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/LUCENE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomoko Uchida resolved LUCENE-9344.
-----------------------------------
    Fix Version/s: master (9.0)
       Resolution: Fixed

> Convert XXX.txt files to proper XXX.md
> --------------------------------------
>
>                 Key: LUCENE-9344
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9344
>             Project: Lucene - Core
>          Issue Type: Improvement
>    Affects Versions: master (9.0)
>            Reporter: Tomoko Uchida
>            Assignee: Tomoko Uchida
>            Priority: Minor
>             Fix For: master (9.0)
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Text files that are (partially) written in markdown (such as "README.txt")
> can be converted to proper markdown files. This change was suggested on
> LUCENE-9321.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9344) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/LUCENE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091217#comment-17091217 ]

ASF subversion and git services commented on LUCENE-9344:
---------------------------------------------------------

Commit 75b648ce828f1131824330adc14a5ae1f850bc35 in lucene-solr's branch refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=75b648c ]

LUCENE-9344: Use https url for lucene.apache.org
[GitHub] [lucene-solr] mocobeta commented on pull request #1449: LUCENE-9344: Convert XXX.txt files to proper XXX.md
mocobeta commented on pull request #1449:
URL: https://github.com/apache/lucene-solr/pull/1449#issuecomment-618810502

Thank you for reviewing, I just merged it to master.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[jira] [Commented] (LUCENE-9344) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/LUCENE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091213#comment-17091213 ]

ASF subversion and git services commented on LUCENE-9344:
---------------------------------------------------------

Commit c7697b088c955c9bcbd489145b396f1540c584d6 in lucene-solr's branch refs/heads/master from Tomoko Uchida
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c7697b0 ]

LUCENE-9344: Convert .txt files to properly formatted .md files (#1449)
[jira] [Commented] (SOLR-14423) static caches in StreamHandler ought to move to CoreContainer lifecycle
[ https://issues.apache.org/jira/browse/SOLR-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091198#comment-17091198 ]

David Smiley commented on SOLR-14423:
-------------------------------------

Great feedback AB. I agree on the dependency injection framework point.

> static caches in StreamHandler ought to move to CoreContainer lifecycle
> -----------------------------------------------------------------------
>
>                 Key: SOLR-14423
>                 URL: https://issues.apache.org/jira/browse/SOLR-14423
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: streaming expressions
>            Reporter: David Smiley
>            Priority: Major
>
> StreamHandler (at "/stream") has several statically declared caches. I think
> this is problematic, such as in testing wherein multiple nodes could be in
> the same JVM. One of them is more serious -- SolrClientCache which is
> closed/cleared via a SolrCore close hook. That's bad for performance but also
> dangerous since another core might want to use one of these clients!
> CC [~jbernste]
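As a rough illustration of the lifecycle concern above (plain Java, not Solr's actual classes -- StreamHandler's real caches and SolrClientCache are more involved), a static cache is shared by every node that happens to live in the same JVM, while a container-scoped cache lives and dies with its owner:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical container-scoped cache: entries live and die with the container.
class Container implements AutoCloseable {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    Object computeIfAbsent(String key) {
        // Reuses the container's own entry; never visible to other containers.
        return cache.computeIfAbsent(key, k -> new Object());
    }

    @Override
    public void close() {
        // Closing this container cannot clobber another container's clients.
        cache.clear();
    }
}

public class CacheScopeSketch {
    // The problematic pattern: one JVM-wide map shared by every node that
    // happens to live in the same JVM (e.g. multiple test nodes).
    static final Map<String, Object> STATIC_CACHE = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        try (Container node1 = new Container(); Container node2 = new Container()) {
            // Each container holds its own client instance.
            System.out.println(node1.computeIfAbsent("client") != node2.computeIfAbsent("client")); // prints "true"
        }
    }
}
```

The sketch only shows the scoping difference; the actual fix would hang the cache off CoreContainer so its lifetime matches the node rather than any single core.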
[jira] [Updated] (SOLR-14434) Multiterm Analyzer Not Persisted in Managed Schema
[ https://issues.apache.org/jira/browse/SOLR-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Trey Grainger updated SOLR-14434:
---------------------------------
    Description:

In addition to "{{index}}" and "{{query}}" analyzers, Solr supports adding an explicit "{{multiterm}}" analyzer to schema {{fieldType}} definitions. This allows for specific control over analysis for things like wildcard terms, prefix queries, range queries, etc.

For example, the following would cause the wildcard query for "{{hats*}}" to get stemmed to "{{hat*}}" instead of "{{hats*}}", and thus match on the indexed version of "{{hat}}".

{code:java}
{code}

This works fine if using a non-managed schema (i.e. {{schema.xml}} file) OR if you use managed schema (i.e. {{managed-schema}} file) and push your schema directly to Zookeeper. However, starting with Solr 8.0, if you use the Schema API to add a {{fieldType}}, the {{multiterm}} analyzers are not persisted (only {{index}} and {{query}} analyzers are).

This bug seems to have originated from LUCENE-8497, which refactored this code area substantially. The bug is caused by the managed schema being able to READ in the {{multiterm}} analyzers from the schema file, but then being unable to write them out. Since pushing the schema directly to Zookeeper only requires Solr reading them in, this bug would not have been obvious in initial testing. However, since the schema API reads in the schema file, writes an updated schema out to Zookeeper (where the bug occurs), and then reads the file back in, all of the {{multiTerm}} analyzers get stripped out.

I've identified the problematic code and am looking into an appropriate fix.

> Multiterm Analyzer Not Persisted in Managed Schema
> --------------------------------------------------
>
>                 Key: SOLR-14434
>                 URL: https://issues.apache.org/jira/browse/SOLR-14434
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>    Affects Versions: 8.0, 8.1, 8.2, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1
>            Reporter: Trey Grainger
>            Priority: Major
[jira] [Created] (SOLR-14434) Multiterm Analyzer Not Persisted in Managed Schema
Trey Grainger created SOLR-14434:
---------------------------------

             Summary: Multiterm Analyzer Not Persisted in Managed Schema
                 Key: SOLR-14434
                 URL: https://issues.apache.org/jira/browse/SOLR-14434
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Schema and Analysis
    Affects Versions: 8.5.1, 8.4.1, 8.5, 8.3.1, 8.4, 8.3, 8.1.1, 8.2, 8.1, 8.0
            Reporter: Trey Grainger
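The {code} block in the SOLR-14434 description lost its XML content in the archive (only attribute fragments such as positionIncrementGap="100", termOffsets="true", termVectors="true", and a synonym filter with ignoreCase="true" synonyms="synonyms.txt" survive in some copies). As a stand-in, here is a minimal, hypothetical {{fieldType}} sketch -- the type name and filter choices are illustrative, not the reporter's original -- showing an explicit {{multiterm}} analyzer that stems wildcard terms so that "{{hats*}}" matches documents indexed as "{{hat}}":

```xml
<!-- Hypothetical sketch: type name and filter choices are illustrative. -->
<fieldType name="text_stemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
  <!-- The explicit multiterm analyzer is applied to wildcard/prefix/range
       terms, so "hats*" is stemmed to "hat*" before matching. -->
  <analyzer type="multiterm">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Per the bug report, a type like this round-trips fine when the schema is pushed directly to Zookeeper, but the {{multiterm}} analyzer is dropped when the same type is added through the Schema API.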
[jira] [Commented] (SOLR-14414) New Admin UI
[ https://issues.apache.org/jira/browse/SOLR-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091110#comment-17091110 ]

Marcus Eagan commented on SOLR-14414:
-------------------------------------

I think that makes sense. I will post some pros and cons of each solution in the SIP so that everyone has the information that I have.

> New Admin UI
> ------------
>
>                 Key: SOLR-14414
>                 URL: https://issues.apache.org/jira/browse/SOLR-14414
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Admin UI
>    Affects Versions: master (9.0)
>            Reporter: Marcus Eagan
>            Priority: Major
>         Attachments: QueryUX-SolrAdminUIReboot.mov
>
> We have had a lengthy discussion on the mailing list about the need to build
> a modern UI that is both more secure and does not depend on deprecated,
> end-of-life code. In this ticket, I intend to familiarize the community with
> the efforts of the community to do just that. While we are nearing feature
> parity, but not there yet, as many have suggested we could complete this
> task in iterations, here is an attempt to get the ball rolling. I have
> mostly worked on it on weekend nights when I could find the time. Angular is
> certainly not my specialty, and this is my first attempt at using TypeScript
> besides a few brief learning exercises here and there. However, I will be
> engaging experts in both of these areas for consultation as our community
> tries to pull our UI into another era.
>
> Many of the components here can improve. One or two of them need to be
> rewritten, and there are at least three essential components of the app
> missing, along with some tests. Another thing missing is the V2 API, which I
> found difficult to build with in this context because it is not documented
> on the web. I understand that it is "self-documenting," but the most
> easy-to-use APIs are still documented on the web. Maybe it is entirely
> documented on the web, and I had trouble finding it. Forgive me, as that
> could be an area of assistance. Another area where I need assistance is
> packaging this application as a Solr package. I understand this app is not
> in the right place for that today, but it can be. There are still many
> improvements to be made in this Jira and certainly in this code.
>
> The project is located in {{lucene-solr/solr/webapp2}}, where there is a
> README with information on running the app. The app can be started from this
> directory with {{npm start}} for now. It can quickly be modified to start as
> part of the typical start commands as it approaches parity. I expect there
> will be a lot of opinions. I welcome them, of course. The community input
> should drive the project's success.
>
> Discussion in mailing list:
> https://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAF76exK-EB_tyFx0B4fBiA%3DJj8gH%3Divn2Uo6cWvMwhvzRdA3KA%40mail.gmail.com%3E
[jira] [Comment Edited] (SOLR-14414) New Admin UI
[ https://issues.apache.org/jira/browse/SOLR-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091010#comment-17091010 ]

Marcus Eagan edited comment on SOLR-14414 at 4/24/20, 2:49 AM:
---------------------------------------------------------------

Ok Tomas. I intend to agree broadly. I even filed the disabled PR. It will be reviewed and merged sooner than the UI for sure. I think that it's fine if the package is pulled in as a pinned version and included in the CI pipeline. I'm particularly not looking forward to working with Noble, but I hope that we can be respectful to drive this project to completion. That way, the admin UI can iterate faster. But perhaps that should be a future state, because that requires a lot more work and bureaucracy from someone here.

I'm happy to do a lot of the legwork on the actual build, recruit devs to pitch in, and even host a server teaching people how to use the Admin UI. If the desire is to have it live in a separate repository, then [~janhoy], can you work on this ASAP? Jeremy, the developer who originally started the Angular project and wrote the first page of the project that I have been building upon, and I have discussed the effort and regular meetings. I think we could move really fast working together, but I won't be able to say how fast until the end of next week.
[jira] [Commented] (SOLR-13289) Support for BlockMax WAND
[ https://issues.apache.org/jira/browse/SOLR-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091106#comment-17091106 ]

Ishan Chattopadhyaya commented on SOLR-13289:
---------------------------------------------

[~tflobbe], I was working on this last month and I'm actually much farther along on the patch than what I put here. I'll put together an updated patch by next week, and we can collaborate on this from there. WDYT?

> Support for BlockMax WAND
> -------------------------
>
>                 Key: SOLR-13289
>                 URL: https://issues.apache.org/jira/browse/SOLR-13289
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>         Attachments: SOLR-13289.patch, SOLR-13289.patch
>
> LUCENE-8135 introduced BlockMax WAND as a major speed improvement. Need to
> expose this via Solr. When enabled, the numFound returned will not be exact.
[jira] [Commented] (SOLR-13325) Add a collection selector to triggers
[ https://issues.apache.org/jira/browse/SOLR-13325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091090#comment-17091090 ]

Shalin Shekhar Mangar commented on SOLR-13325:
----------------------------------------------

I'm looking at this again. I think we should change the syntax slightly and get rid of the {{#policy}} key name. Instead, this can operate on any collection property, such as policy, configName, or autoAddReplicas, that is part of the collection state. What's slightly complicating is that there are additional collection properties (stored in collectionprops.json). I don't intend to support those at the moment. On a related note, collection props have write APIs but no read APIs, which severely limits the usefulness of that feature. That's something we should fix separately.

Now once we have this working, it reduces the need for a separate AutoAddReplicasPlanAction, because you can get the same behavior by setting the following in ComputePlanAction:

{code}
"collection": {"autoAddReplicas": "true"}
{code}

However, there is a difference between the current implementation of "collections" in ComputePlanAction and how AutoAddReplicasPlanAction works: the former filters out suggestions for non-matching collections, while the latter pushes down the collection hint to the policy engine so that it doesn't even compute suggestions for non-matching collections in the first place. The latter is obviously more efficient. The one thing we have to be careful about is that the list of matching collections should be evaluated lazily, when the action is triggered, instead of early in the init method, so that it can *see* changes in the cluster state.

> Add a collection selector to triggers
> -------------------------------------
>
>                 Key: SOLR-13325
>                 URL: https://issues.apache.org/jira/browse/SOLR-13325
>             Project: Solr
>          Issue Type: Improvement
>          Components: AutoScaling
>            Reporter: Shalin Shekhar Mangar
>            Priority: Major
>             Fix For: master (9.0), 8.2
>
> Similar to SOLR-13273, it'd be nice to have a collection selector that
> applies to triggers. An example use-case would be to selectively add replicas
> on new nodes for certain collections only.
>
> Here is a selector that returns collections that match the given collection
> property/value pair:
> {code}
> "collection": {"property_name": "property_value"}
> {code}
>
> Here's another selector that returns collections that have the given policy
> applied:
> {code}
> "collection": {"#policy": "policy_name"}
> {code}
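For illustration, here is a hypothetical trigger configuration using the property-based selector discussed above -- the trigger and action names are made up, and the exact selector syntax was still under discussion at this point:

```json
{
  "set-trigger": {
    "name": "node_added_trigger",
    "event": "nodeAdded",
    "waitFor": "5s",
    "enabled": true,
    "actions": [
      {
        "name": "compute_plan",
        "class": "solr.ComputePlanAction",
        "collection": {"autoAddReplicas": "true"}
      },
      {
        "name": "execute_plan",
        "class": "solr.ExecutePlanAction"
      }
    ]
  }
}
```

With a selector like this, only collections whose state matches the given property/value pair would be considered when the plan is computed (or, in the more efficient variant, the hint would be pushed down so non-matching collections are never evaluated at all).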
[jira] [Commented] (SOLR-14414) New Admin UI
[ https://issues.apache.org/jira/browse/SOLR-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091022#comment-17091022 ]

Jan Høydahl commented on SOLR-14414:
------------------------------------

I prefer if we can work at the SIP level for now: expand on and understand the options, continue iterating mainly in the email thread (not in the PR or JIRA), understand the pros/cons of the various options, and finally end up with a SIP proposal that has broad support. Please understand, I have not *decided* on either a separate repo/subproject or on Vue; I am just trying to help guide the process of making sure we all consider our options before diving into code and getting vetoed. And as always, keeping things simple in the first phase is a good idea, i.e. forget about the package manager and running npm through Java for now.
[jira] [Commented] (SOLR-14414) New Admin UI
[ https://issues.apache.org/jira/browse/SOLR-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091010#comment-17091010 ]

Marcus Eagan commented on SOLR-14414:
-------------------------------------

Ok Tomas. I intend to agree broadly. I even filed the disabled PR. It will be reviewed and merged sooner than the UI for sure. I think that it's fine if the package is pulled in as a pinned version and included in the CI pipeline. I'm particularly not looking forward to working with Noble, but I hope that we can be respectful to drive this project to completion. That way, the admin UI can iterate faster. But perhaps that should be a future state, because that requires a lot more work and bureaucracy from someone here.

I'm happy to do a lot of the legwork on the actual build, recruit devs to pitch in, and even host a server teaching people how to use the Admin UI. If the desire is to have it live in a separate repository, then @Jan Høydahl, can you work on this ASAP? Jeremy, the developer who originally started the Angular project and wrote the first page of the project that I have been building upon, and I have discussed the effort and regular meetings. I think we could move really fast working together, but I won't be able to say how fast until the end of next week.
[jira] [Commented] (SOLR-14414) New Admin UI
[ https://issues.apache.org/jira/browse/SOLR-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091006#comment-17091006 ] Tomas Eduardo Fernandez Lobbe commented on SOLR-14414: -- Thanks for working on this Marcus. I personally agree 100% with what [~gus] said on the email thread. * The UI is critical for Solr. if it's a package for modularization purposes may be fine (I didn't look much into the packages design yet), but I think it needs to be on by default. Having the ability to "disable" it could be nice too, but not as important to me, I'd always want it on. * It's very difficult to have the UI live in a separate repo and not fall out of sync. Can't be compared with Kibana or any other enterprise product. I don't think this is a good idea. * I'd love for the UI to run in the same process as Solr and not have to start/monitor another app. > New Admin UI > > > Key: SOLR-14414 > URL: https://issues.apache.org/jira/browse/SOLR-14414 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Affects Versions: master (9.0) >Reporter: Marcus Eagan >Priority: Major > Attachments: QueryUX-SolrAdminUIReboot.mov > > > We have had a lengthy discussion in the mailing list about the need to build > a modern UI that is both more security and does not depend on deprecated, end > of life code. In this ticket, I intend to familiarize the community with the > efforts of the community to do just that that. While we are nearing feature > parity, but not there yet as many have suggested we could complete this task > in iterations, here is an attempt to get the ball rolling. I have mostly > worked on it in weekend nights on the occasion that I could find the time. > Angular is certainly not my specialty, and this is my first attempt at using > TypeScript besides a few brief learning exercises here and there. 
However, I > will be engaging experts in both of these areas for consultation as our > community tries to pull our UI into another era. > Many of the components here can improve. One or two of them need to be > rewritten, and there are even at least three essential components to the app > missing, along with some tests. Another missing piece is the V2 API, > which I found difficult to build against in this context because it is not > documented on the web. I understand that it is "self-documenting," but the > easiest-to-use APIs are still documented on the web. Maybe it is entirely > documented on the web and I had trouble finding it. Forgive me, as that > could be an area of assistance. Another area where I need assistance is > packaging this application as a Solr package. I understand this app is not in > the right place for that today, but it can be. There are still many > improvements to be made in this Jira and certainly in this code. > The project is located in {{lucene-solr/solr/webapp2}}, where there is a > README with information on running the app. > The app can be started from this directory with {{npm start}} for now. It > can quickly be modified to start as part of the typical start commands as > it approaches parity. I expect there will be a lot of opinions. I welcome > them, of course. Community input should drive the project's success. > Discussion in mailing list: > https://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAF76exK-EB_tyFx0B4fBiA%3DJj8gH%3Divn2Uo6cWvMwhvzRdA3KA%40mail.gmail.com%3E -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14414) New Admin UI
[ https://issues.apache.org/jira/browse/SOLR-14414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Eduardo Fernandez Lobbe updated SOLR-14414: - Description: We have had a lengthy discussion in the mailing list about the need to build a modern UI that is both more secure and does not depend on deprecated, end-of-life code. In this ticket, I intend to familiarize the community with the efforts of the community to do just that. While we are nearing feature parity but are not there yet, and many have suggested we could complete this task in iterations, here is an attempt to get the ball rolling. I have mostly worked on it on weekend nights, when I could find the time. Angular is certainly not my specialty, and this is my first attempt at using TypeScript besides a few brief learning exercises here and there. However, I will be engaging experts in both of these areas for consultation as our community tries to pull our UI into another era. Many of the components here can improve. One or two of them need to be rewritten, and there are even at least three essential components to the app missing, along with some tests. Another missing piece is the V2 API, which I found difficult to build against in this context because it is not documented on the web. I understand that it is "self-documenting," but the easiest-to-use APIs are still documented on the web. Maybe it is entirely documented on the web and I had trouble finding it. Forgive me, as that could be an area of assistance. Another area where I need assistance is packaging this application as a Solr package. I understand this app is not in the right place for that today, but it can be. There are still many improvements to be made in this Jira and certainly in this code. The project is located in {{lucene-solr/solr/webapp2}}, where there is a README with information on running the app. The app can be started from this directory with {{npm start}} for now. 
It can quickly be modified to start as part of the typical start commands as it approaches parity. I expect there will be a lot of opinions. I welcome them, of course. Community input should drive the project's success. Discussion in mailing list: https://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAF76exK-EB_tyFx0B4fBiA%3DJj8gH%3Divn2Uo6cWvMwhvzRdA3KA%40mail.gmail.com%3E was: We have had a lengthy discussion in the mailing list about the need to build a modern UI that is both more secure and does not depend on deprecated, end-of-life code. In this ticket, I intend to familiarize the community with the efforts of the community to do just that. While we are nearing feature parity but are not there yet, and many have suggested we could complete this task in iterations, here is an attempt to get the ball rolling. I have mostly worked on it on weekend nights, when I could find the time. Angular is certainly not my specialty, and this is my first attempt at using TypeScript besides a few brief learning exercises here and there. However, I will be engaging experts in both of these areas for consultation as our community tries to pull our UI into another era. Many of the components here can improve. One or two of them need to be rewritten, and there are even at least three essential components to the app missing, along with some tests. Another missing piece is the V2 API, which I found difficult to build against in this context because it is not documented on the web. I understand that it is "self-documenting," but the easiest-to-use APIs are still documented on the web. Maybe it is entirely documented on the web and I had trouble finding it. Forgive me, as that could be an area of assistance. Another area where I need assistance is packaging this application as a Solr package. I understand this app is not in the right place for that today, but it can be. 
There are still many improvements to be made in this Jira and certainly in this code. The project is located in {{lucene-solr/solr/webapp2}}, where there is a README with information on running the app. The app can be started from this directory with {{npm start}} for now. It can quickly be modified to start as part of the typical start commands as it approaches parity. I expect there will be a lot of opinions. I welcome them, of course. Community input should drive the project's success. > New Admin UI > > > Key: SOLR-14414 > URL: https://issues.apache.org/jira/browse/SOLR-14414 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Affects Versions: master (9.0) >Reporter: Marcus Eagan >Priority: Major > Attachments:
[GitHub] [lucene-solr] dsmiley opened a new pull request #1453: SOLR-14433: Improve SolrShardReporter default metrics list
dsmiley opened a new pull request #1453: URL: https://github.com/apache/lucene-solr/pull/1453 https://issues.apache.org/jira/browse/SOLR-14433# Now includes TLOG and UPDATE./update. These were small bugs to begin with, but from a user's perspective this is an incremental improvement. CC @sigram This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14433) Improve default metrics collected by SolrShardReporter
[ https://issues.apache.org/jira/browse/SOLR-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-14433: Description: SolrShardReporter's default metric filters have two problems: * "Tlog.\*" should be "TLOG.\*" (a bug) * "UPDATE\\./update/.\*requests" should be "UPDATE\\./update.\*requests" (notice removal of one '/') Today, the first was fixed and tagged to the issue that incorrectly made this change – SOLR-12690. What remains is the other. CC [~ab] was: SolrShardReporter's default metric filters have two problems: * "Tlog.*" should be "TLOG.*" (a bug) * "UPDATE\\./update/.*requests" should be "UPDATE\\./update.*requests" (notice removal of one '/') Today, the first was fixed and tagged to the issue that incorrectly made this change – SOLR-12690. What remains is the other. CC [~ab] > Improve default metrics collected by SolrShardReporter > -- > > Key: SOLR-14433 > URL: https://issues.apache.org/jira/browse/SOLR-14433 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: metrics >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > > SolrShardReporter's default metric filters have two problems: > * "Tlog.\*" should be "TLOG.\*" (a bug) > * "UPDATE\\./update/.\*requests" should be "UPDATE\\./update.\*requests" (notice removal of one '/') > Today, the first was fixed and tagged to the issue that incorrectly made this > change – SOLR-12690. What remains is the other. > CC [~ab] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14433) Improve default metrics collected by SolrShardReporter
David Smiley created SOLR-14433: --- Summary: Improve default metrics collected by SolrShardReporter Key: SOLR-14433 URL: https://issues.apache.org/jira/browse/SOLR-14433 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: metrics Reporter: David Smiley Assignee: David Smiley SolrShardReporter's default metric filters have two problems: * "Tlog.*" should be "TLOG.*" (a bug) * "UPDATE\\./update/.*requests" should be "UPDATE\\./update.*requests" (notice removal of one '/') Today, the first was fixed and tagged to the issue that incorrectly made this change – SOLR-12690. What remains is the other. CC [~ab] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
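The two regex problems described above can be illustrated with plain java.util.regex. A minimal sketch; the metric names used below are hypothetical examples chosen for illustration, not actual Solr metric names:

```java
import java.util.regex.Pattern;

// Sketch of why the two default filter prefixes miss metrics.
public class MetricFilterDemo {
    // Buggy prefix: lowercase "Tlog" never matches the upper-case TLOG group.
    public static final Pattern BUGGY_TLOG = Pattern.compile("Tlog.*");
    public static final Pattern FIXED_TLOG = Pattern.compile("TLOG.*");

    // Buggy pattern requires a '/' right after "/update", so metrics of the
    // bare "/update" handler are skipped; dropping that '/' fixes it.
    public static final Pattern BUGGY_UPDATE = Pattern.compile("UPDATE\\./update/.*requests");
    public static final Pattern FIXED_UPDATE = Pattern.compile("UPDATE\\./update.*requests");

    // Full-string match, as a prefix-style metric filter would apply it.
    public static boolean matches(Pattern p, String metricName) {
        return p.matcher(metricName).matches();
    }
}
```

Running the fixed patterns against names like "TLOG.replay.remaining.logs" or "UPDATE./update.requests" shows them matching where the buggy ones do not.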
[jira] [Commented] (SOLR-14430) Authorization plugins should check roles from request
[ https://issues.apache.org/jira/browse/SOLR-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090943#comment-17090943 ] Jan Høydahl commented on SOLR-14430: The JWTAuth plugin wraps the user principal in the class {{JWTPrincipalWithUserRoles}}, which implements {{org.apache.solr.security.VerifiedUserRoles}}: {code:java} Set<String> getVerifiedRoles(); {code} Currently that class is not used other than in tests, but my next idea was to implement SOLR-12131, which adds a new class [ExternalRoleRuleBasedAuthorizationPlugin|https://github.com/apache/lucene-solr/pull/341/files#diff-1605e924a4ccb6bddd1f776e54b8f2cd] that reads the roles from the request (VerifiedUserRoles) instead of from a user->role mapping. I hope you can review my PR and tell me what you think of that approach. > Authorization plugins should check roles from request > - > > Key: SOLR-14430 > URL: https://issues.apache.org/jira/browse/SOLR-14430 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: security >Reporter: Mike Drob >Priority: Major > > The AuthorizationContext exposes {{getUserPrincipal}} to the plugin, but it > does not allow the plugin to interrogate the request for {{isUserInRole}}. If > we trust the request enough to get a principal from it, then we should trust > it enough to ask about roles, as those could have been defined and verified > by an authentication plugin. > This model would be an alternative to the current model where > RuleBasedAuthorizationPlugin maintains its own user->role mapping. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
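A minimal sketch of the roles-from-request idea discussed above. Only the VerifiedUserRoles/getVerifiedRoles names come from the comment; every other class and method here is illustrative, not Solr's actual API:

```java
import java.security.Principal;
import java.util.Set;

// The authorization side reads roles that the authentication plugin attached
// to the principal, instead of consulting its own user->role mapping.
public class RolesFromRequestSketch {

    // Mirrors the shape of org.apache.solr.security.VerifiedUserRoles.
    public interface VerifiedUserRoles {
        Set<String> getVerifiedRoles();
    }

    // A principal as an auth plugin (e.g. JWT) might produce it; hypothetical name.
    public static class PrincipalWithRoles implements Principal, VerifiedUserRoles {
        private final String name;
        private final Set<String> roles;
        public PrincipalWithRoles(String name, Set<String> roles) {
            this.name = name;
            this.roles = roles;
        }
        @Override public String getName() { return name; }
        @Override public Set<String> getVerifiedRoles() { return roles; }
    }

    // Authorization check: trust roles only if the principal actually carries them.
    public static boolean hasRole(Principal p, String role) {
        return p instanceof VerifiedUserRoles
            && ((VerifiedUserRoles) p).getVerifiedRoles().contains(role);
    }
}
```

The design point is that the role set travels with the already-verified principal, so no separate mapping can drift out of sync with the authentication layer.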
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1450: SOLR-14429: Convert XXX.txt files to proper XXX.md
dsmiley commented on a change in pull request #1450: URL: https://github.com/apache/lucene-solr/pull/1450#discussion_r414110471 ## File path: solr/example/files/README.md ## @@ -111,38 +120,46 @@ For further explanations, see the frequently asked questions at the end of the g * Another way to query the index is by manipulating the URL in your address bar once in the browse view. * i.e. : [http://localhost:8983/solr/files/browse?q=Lucene](http://localhost:8983/solr/files/browse?q=Lucene) - -##FAQs + +## FAQs * Why use -d when creating a core? * -d specifies a specific configuration to use. This example as a configuration tuned for indexing and query rich text files. * How do I delete a core? - * To delete a core (i.e. files), you can enter the following in your command shell: - bin/solr delete -c files - - * You should see the following output: + * To delete a core (i.e. files), you can enter the following in your command shell: - Deleting core 'files' using command: - http://localhost:8983/solr/admin/cores?action=UNLOAD=files=true=true=true +``` +bin/solr delete -c files +``` + + * You should see the following output: + + Deleting core 'files' using command: + + ``` + http://localhost:8983/solr/admin/cores?action=UNLOAD=files=true=true=true - {"responseHeader":{ - "status":0, - "QTime":19}} - - * This calls the Solr core admin handler, "UNLOAD", and the parameters "deleteDataDir" and "deleteInstanceDir" to ensure that all data associated with core is also removed + {"responseHeader":{ + "status":0, + "QTime":19}} +``` + + * This calls the Solr core admin handler, "UNLOAD", and the parameters "deleteDataDir" and "deleteInstanceDir" to ensure that all data associated with core is also removed * How can I change the /browse UI? - The primary templates are under example/files/conf/velocity. 
**In order to edit those files in place (without having to - re-create or patch a core/collection with an updated configuration)**, Solr can be started with a special system property - set to the _absolute_ path to the conf/velocity directory, like this: - - bin/solr start -Dvelocity.template.base.dir=/example/files/conf/velocity/ +The primary templates are under example/files/conf/velocity. **In order to edit those files in place (without having to +re-create or patch a core/collection with an updated configuration)**, Solr can be started with a special system property +set to the _absolute_ path to the conf/velocity directory, like this: -If you want to adjust the browse templates for an existing collection, edit the core’s configuration -under server/solr/files/conf/velocity. +``` +bin/solr start -Dvelocity.template.base.dir=/example/files/conf/velocity/ +``` + +If you want to adjust the browse templates for an existing collection, edit the core’s configuration +under server/solr/files/conf/velocity. === Review comment: At least on GitHub, this isn't showing as markup. (I assume this line is at the end directly about the provenance info) ## File path: solr/example/README.md ## @@ -1,57 +1,74 @@ -# Licensed to the Apache Software Foundation (ASF) under one or more -# contributor license agreements. See the NOTICE file distributed with -# this work for additional information regarding copyright ownership. -# The ASF licenses this file to You under the Apache License, Version 2.0 -# (the "License"); you may not use this file except in compliance with -# the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. + Solr example This directory contains Solr examples. Each example is contained in a separate directory. To run a specific example, do: +``` bin/solr -e where is one of: cloud: SolrCloud example dih : Data Import Handler (rdbms, mail, atom, tika) schemaless : Schema-less example (schema is inferred from data during indexing) techproducts : Kitchen sink example providing comprehensive examples of Solr features +``` For instance, if you want to run the Solr Data Import Handler example, do: +``` bin/solr -e dih - +``` + To see all the options available when starting Solr: +``` bin/solr
[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness
[ https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090931#comment-17090931 ] Michael Gibney commented on SOLR-13132: --- Thanks, Hoss! Some initial responses re: some of the nocommit comments from [8fcd6271b6|https://github.com/magibney/lucene-solr/commit/8fcd6271b684da589ddae8e4319b564249ee76cb]: {code:java} // nocommit: for that matter: can we eliminate SweepingAcc as a class, // nocommit: and just roll that specific logic into CountSlotAcc? // nocommit: IIUC: there should only ever be a single SweepingAcc instance, // nocommit: and callers should never use/instantiate a SweepingAcc w/o using 'countAcc' ... correct? {code} That would definitely work. However, my initial inclination is to prefer leaving {{SweepingAcc}} as a separate class, because {{CountSlotAcc}} currently clearly does one specific thing, and folding the {{SweepingAcc}} functionality (which could be relatively complex -- potentially deduping {{DocSets}}, etc...) would mix in a different type of functionality that's only relevant in some of the contexts where {{CountSlotAcc}} is currently used. Accessing {{SweepingAcc}} via {{countAcc.getBaseSweepingAcc()}} strikes a balance of using {{countAcc}} as a coordination point for related but distinct functionality ... perhaps a cleaner separation of concerns? {code:java} // nocommit: since 'countAcc' is now the special place all sweeping is tracked, it seems // nocommit: unneccessary (and uneccessarly confusing) for it to also be a 'SweepableSlotAcc' // nocommit: any reason not to just remove this? abstract class CountSlotAcc extends SlotAcc implements ReadOnlyCountSlotAcc /*, SweepableSlotAcc ... nocommit... */ { ... // nocommit: CountSlotAcc no longer implements SweepableSlotAcc... 
// @Override // public CountSlotAcc registerSweepingAccs(SweepingAcc baseSweepingAcc) { // baseSweepingAcc.add(new SweepCountAccStruct(fcontext.base, false, this, this)); // baseSweepingAcc.registerMapping(this, this); // return null; // } {code} True, I'm glad you mentioned this. I left this in partly to illustrate another concrete case (aside from SKG) for which sweep collection might be useful. In its current state it admittedly seems a bit contrived, but my thinking was: although {{countAcc}} is currently the one and only {{CountSlotAcc}}, used to accumulate counts over the base domain {{DocSet}} only, there could be cases where extra {{CountSlotAccs}} are used more directly (e.g. as part of stats collection, analogous to how they're used indirectly for SKG sweep collection). In such a case, these "non-base" {{CountSlotAccs}} would respond as implemented in the above {{registerSweepingAccs(...)}} method. More practically speaking, it also occurred to me that one promising use of sweep collection would be to accumulate counts over all subfacet domains in a single sweep (for nested/sub-facets, pivot facets, what-have-you) -- not sure if that would be directly accommodated by the current incarnation of the "sweeping", but it might be a use case to consider. With all that said, I'm not at all opposed to removing the {{SweepableSlotAcc}} interface from {{CountSlotAcc}}; it should anyway be straightforward to add back in later should the need arise. 
> Improve JSON "terms" facet performance when sorted by relatedness > -- > > Key: SOLR-13132 > URL: https://issues.apache.org/jira/browse/SOLR-13132 > Project: Solr > Issue Type: Improvement > Components: Facet Module >Affects Versions: 7.4, master (9.0) >Reporter: Michael Gibney >Priority: Major > Attachments: SOLR-13132-with-cache-01.patch, > SOLR-13132-with-cache.patch, SOLR-13132.patch, SOLR-13132_testSweep.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate > {{relatedness}} for every term. > The current implementation uses a standard uninverted approach (either > {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain > base docSet, and then uses that initial pass as a pre-filter for a > second-pass, inverted approach of fetching docSets for each relevant term > (i.e., {{count > minCount}}?) and calculating intersection size of those sets > with the domain base docSet. > Over high-cardinality fields, the overhead of per-term docSet creation and > set intersection operations increases request latency to the point where > relatedness sort may not be usable in practice (for my use case, even after > applying the patch for SOLR-13108, for a field with ~220k unique terms per > core, QTime for high-cardinality domain docSets were, e.g.: cardinality > 1816684=9000ms, cardinality 5032902=18000ms). > The attached
[jira] [Resolved] (LUCENE-9342) Collector's totalHitsThreshold should not be lower than numHits
[ https://issues.apache.org/jira/browse/LUCENE-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Eduardo Fernandez Lobbe resolved LUCENE-9342. --- Fix Version/s: 8.6 master (9.0) Resolution: Fixed > Collector's totalHitsThreshold should not be lower than numHits > --- > > Key: LUCENE-9342 > URL: https://issues.apache.org/jira/browse/LUCENE-9342 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Tomas Eduardo Fernandez Lobbe >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Minor > Fix For: master (9.0), 8.6 > > > While looking at SOLR-13289 I noticed this situation. If I create a collector > with {{numHits}} greater than {{totalHitsThreshold}}, and the number of hits > in the query is somewhere between those two numbers, the collector’s > {{totalHitRelation}} will be {{TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO}}, > however the count will be accurate in this case. While this doesn't violate > the current contract, the {{totalHitRelation}} could be changed to > {{TotalHits.Relation.EQUAL_TO}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9342) Collector's totalHitsThreshold should not be lower than numHits
[ https://issues.apache.org/jira/browse/LUCENE-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090919#comment-17090919 ] ASF subversion and git services commented on LUCENE-9342: - Commit edd00d933f6144293d74fc727fec6190f28c57a0 in lucene-solr's branch refs/heads/branch_8x from Tomas Eduardo Fernandez Lobbe [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=edd00d9 ] LUCENE-9342: Collector's totalHitsThreshold should not be lower than numHits (#1448) Use the maximum of the two, this is so that relation is EQUAL_TO in the case of the number of hits in a query is less than the collector's numHits > Collector's totalHitsThreshold should not be lower than numHits > --- > > Key: LUCENE-9342 > URL: https://issues.apache.org/jira/browse/LUCENE-9342 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Tomas Eduardo Fernandez Lobbe >Priority: Minor > > While looking at SOLR-13289 I noticed this situation. If I create a collector > with {{numHits}} greater than {{totalHitsThreshold}}, and the number of hits > in the query is somewhere between those two numbers, the collector’s > {{totalHitRelation}} will be {{TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO}}, > however the count will be accurate in this case. While this doesn't violate > the current contract, the {{totalHitRelation}} could be changed to > {{TotalHits.Relation.EQUAL_TO}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
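The commit's "use the maximum of the two" logic can be modeled in isolation. A simplified sketch of the relation decision, not Lucene's actual collector code:

```java
// When totalHitsThreshold < numHits, clamp it up to numHits: a top-numHits
// collector visits at least numHits competitive hits anyway, so counts up to
// that point are exact and the relation can honestly be EQUAL_TO.
public class HitsRelationDemo {
    public enum Relation { EQUAL_TO, GREATER_THAN_OR_EQUAL_TO }

    public static Relation relationFor(int numHits, int totalHitsThreshold, int actualHits) {
        // The fix: never use a counting threshold below the number of hits collected.
        int effectiveThreshold = Math.max(numHits, totalHitsThreshold);
        // The count stays exact as long as counting never stopped early.
        return actualHits <= effectiveThreshold
            ? Relation.EQUAL_TO
            : Relation.GREATER_THAN_OR_EQUAL_TO;
    }
}
```

With numHits=100, totalHitsThreshold=10 and 50 actual hits (the situation described in the issue), the clamped threshold makes the relation EQUAL_TO instead of GREATER_THAN_OR_EQUAL_TO.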
[jira] [Assigned] (LUCENE-9342) Collector's totalHitsThreshold should not be lower than numHits
[ https://issues.apache.org/jira/browse/LUCENE-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Eduardo Fernandez Lobbe reassigned LUCENE-9342: - Assignee: Tomas Eduardo Fernandez Lobbe > Collector's totalHitsThreshold should not be lower than numHits > --- > > Key: LUCENE-9342 > URL: https://issues.apache.org/jira/browse/LUCENE-9342 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Tomas Eduardo Fernandez Lobbe >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Minor > > While looking at SOLR-13289 I noticed this situation. If I create a collector > with {{numHits}} greater than {{totalHitsThreshold}}, and the number of hits > in the query is somewhere between those two numbers, the collector’s > {{totalHitRelation}} will be {{TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO}}, > however the count will be accurate in this case. While this doesn't violate > the current contract, the {{totalHitRelation}} could be changed to > {{TotalHits.Relation.EQUAL_TO}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090877#comment-17090877 ] Mike Drob commented on SOLR-14428: -- Adding some notes here as I've been diving through this... The queryResultsCache is built in [SolrIndexSearcher|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L278], and uses a {{QueryResultKey}}, which stores the full {{Query}} object, which in this case means the FuzzyQuery with the automata already built. However, we don't really need to store the automata that we built, since they aren't used for the equality comparison. Maybe there is an elegant way to store a stripped-down FuzzyQuery in the cache? > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Assignee: Andrzej Bialecki >Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > > I sent this to the mailing list: > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. 
> Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! > <1mb > With an empty cache, running this query > _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory > allocation > {noformat} > 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 > 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:648855 > {noformat} > ~1 gives 98253 and ~0 gives 6339 on 8.5.1. 8.3.1 is constant at 1520 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
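The "stripped down" cache-key idea from the comment above can be sketched with a key that carries only the fields equality needs, so the heavy compiled automata are never pinned by the cache. All names are illustrative; this is not Solr's QueryResultKey or Lucene's FuzzyQuery:

```java
import java.util.Objects;

// A lightweight cache key built from the query's identity-defining fields
// (here: field name, term text, max edit distance) rather than from the full
// query object, so per-query automata stay collectable.
public class FuzzyCacheKeySketch {
    public static final class Key {
        final String field;
        final String term;
        final int maxEdits;
        public Key(String field, String term, int maxEdits) {
            this.field = field;
            this.term = term;
            this.maxEdits = maxEdits;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return maxEdits == k.maxEdits && field.equals(k.field) && term.equals(k.term);
        }
        @Override public int hashCode() { return Objects.hash(field, term, maxEdits); }
    }
}
```

Two keys built independently for the same fuzzy query compare equal and hash identically, which is all the queryResultCache needs from the stored entry.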
[jira] [Updated] (SOLR-14432) SOLR Dataimport handler going to idle after some time
[ https://issues.apache.org/jira/browse/SOLR-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi kumar updated SOLR-14432: -- Description: I configured the data import handler to process bulk PDF documents. After processing 21000 documents, the process goes idle and does not process all the documents. When I checked the log I observed the following (log attached for reference). Please let me know if there is any way I can ignore this issue, or any setting I need to update. Error: 2020-04-23 18:39:55.749 INFO (qtp215219944-24) [ x:DMS] o.a.s.c.S.Request [DMS] webapp=/solr path=/dataimport params=\{indent=on=json=status&_=1587664092295} status=0 QTime=0 2020-04-23 18:39:55.972 WARN (Thread-14) [ ] o.a.p.p.COSParser The end of the stream is out of range, using workaround to read the stream, stream start position: 4748210, length: 2007324, expected end position: 6755534 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null object COSObject\{50, 0} from pages dictionary 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null object COSObject\{60, 0} from pages dictionary 2020-04-23 18:39:55.997 ERROR (Thread-14) [ ] o.a.p.c.o.s.SetGraphicsStateParameters name for 'gs' operator not found in resources: /R7 Regards, Ravi kumar was: I configured the data import handler to process bulk PDF documents. After processing 21000 documents, the process goes idle and does not process all the documents. When I checked the log I observed the following (log attached for reference). 
2020-04-23 18:39:55.749 INFO (qtp215219944-24) [ x:DMS] o.a.s.c.S.Request [DMS] webapp=/solr path=/dataimport params=\{indent=on=json=status&_=1587664092295} status=0 QTime=0 2020-04-23 18:39:55.972 WARN (Thread-14) [ ] o.a.p.p.COSParser The end of the stream is out of range, using workaround to read the stream, stream start position: 4748210, length: 2007324, expected end position: 6755534 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null object COSObject\{50, 0} from pages dictionary 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null object COSObject\{60, 0} from pages dictionary 2020-04-23 18:39:55.997 ERROR (Thread-14) [ ] o.a.p.c.o.s.SetGraphicsStateParameters name for 'gs' operator not found in resources: /R7 > SOLR Dataimport handler going to idle after some time > - > > Key: SOLR-14432 > URL: https://issues.apache.org/jira/browse/SOLR-14432 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - DataImportHandler >Affects Versions: 8.5.1 > Environment: Windows >Reporter: Ravi kumar >Priority: Major > Labels: dataimportHandler, solr > Attachments: solr.log > > > I configured the data import handler to process bulk PDF documents. After > processing 21000 documents, the process goes idle and does not process all the > documents. > When I checked the log I observed the following (log attached for reference). > Please let me know if there is any way I can ignore this issue, or any > setting I need to update. 
> Error: > 2020-04-23 18:39:55.749 INFO (qtp215219944-24) [ x:DMS] o.a.s.c.S.Request > [DMS] webapp=/solr path=/dataimport > params=\{indent=on=json=status&_=1587664092295} status=0 QTime=0 > 2020-04-23 18:39:55.972 WARN (Thread-14) [ ] o.a.p.p.COSParser > The end of the stream is out of range, using workaround to > read the stream, stream start position: 4748210, length: 2007324, expected > end position: 6755534 > 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null > object COSObject\{50, 0} from pages dictionary > 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null > object COSObject\{60, 0} from pages dictionary > 2020-04-23 18:39:55.997 ERROR (Thread-14) [ ] > o.a.p.c.o.s.SetGraphicsStateParameters name for 'gs' operator not found in > resources: /R7 > > Regards, > Ravi kumar -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14432) SOLR Dataimport handler going idle after some time
Ravi kumar created SOLR-14432: - Summary: SOLR Dataimport handler going idle after some time Key: SOLR-14432 URL: https://issues.apache.org/jira/browse/SOLR-14432 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - DataImportHandler Affects Versions: 8.5.1 Environment: Windows Reporter: Ravi kumar Attachments: solr.log I configured the data import handler to process bulk PDF documents. After processing 21000 documents, the process goes idle and does not process the remaining documents. When I checked the log I observed the following (log attached for reference). 2020-04-23 18:39:55.749 INFO (qtp215219944-24) [ x:DMS] o.a.s.c.S.Request [DMS] webapp=/solr path=/dataimport params={indent=on=json=status&_=1587664092295} status=0 QTime=0 2020-04-23 18:39:55.972 WARN (Thread-14) [ ] o.a.p.p.COSParser The end of the stream is out of range, using workaround to read the stream, stream start position: 4748210, length: 2007324, expected end position: 6755534 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null object COSObject{50, 0} from pages dictionary 2020-04-23 18:39:55.976 WARN (Thread-14) [ ] o.a.p.p.COSParser Removed null object COSObject{60, 0} from pages dictionary 2020-04-23 18:39:55.997 ERROR (Thread-14) [ ] o.a.p.c.o.s.SetGraphicsStateParameters name for 'gs' operator not found in resources: /R7 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14431) SegmentsInfoRequestHandler.java does not release IndexWriter
Tiziano Degaetano created SOLR-14431: Summary: SegmentsInfoRequestHandler.java does not release IndexWriter Key: SOLR-14431 URL: https://issues.apache.org/jira/browse/SOLR-14431 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Admin UI Affects Versions: 8.5.1, 8.1.1 Reporter: Tiziano Degaetano If withCoreInfo is false iwRef.decref() will not be called to release the reader lock, preventing any further writer locks. https://github.com/apache/lucene-solr/blob/3a743ea953f0ecfc35fc7b198f68d142ce99d789/solr/core/src/java/org/apache/solr/handler/admin/SegmentsInfoRequestHandler.java#L144 Line 130 should be moved inside the if statement L144. [~ab] FYI -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
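The leak reported in SOLR-14431 above is a classic acquire/release mismatch: the `IndexWriter` reference is obtained unconditionally but `iwRef.decref()` only runs on one branch. A minimal sketch of the pattern the fix restores, using a hypothetical stand-in for Solr's `RefCounted` (class and method names here are illustrative, not Solr's actual API):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Solr's RefCounted<IndexWriter>; illustrates the
// acquire/release pattern, not the actual Solr classes.
class RefCountedExample {
    static class RefCounted<T> {
        private final T resource;
        final AtomicInteger refCount = new AtomicInteger(1);
        RefCounted(T resource) { this.resource = resource; }
        T get() { return resource; }
        void decref() { refCount.decrementAndGet(); }
    }

    // Buggy shape: decref() only happens on one branch, so the reference
    // (and with it the writer lock) leaks when withCoreInfo is false.
    static int useLeaky(RefCounted<String> ref, boolean withCoreInfo) {
        String writer = ref.get();
        if (withCoreInfo) {
            // ... gather core info ...
            ref.decref();
        }
        return ref.refCount.get();
    }

    // Fixed shape: release unconditionally in a finally block, on every path.
    static int useFixed(RefCounted<String> ref, boolean withCoreInfo) {
        try {
            String writer = ref.get();
            // ... gather segment info, optionally including core info ...
        } finally {
            ref.decref();
        }
        return ref.refCount.get();
    }
}
```

Moving the release into a `finally` (or scoping the acquire inside the conditional, as the report suggests) makes the count return to zero on both paths.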
[jira] [Commented] (LUCENE-9342) Collector's totalHitsThreshold should not be lower than numHits
[ https://issues.apache.org/jira/browse/LUCENE-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090849#comment-17090849 ] ASF subversion and git services commented on LUCENE-9342: - Commit a11b78e06a5947ffb43a9b66b37033ebe64753e0 in lucene-solr's branch refs/heads/master from Tomas Eduardo Fernandez Lobbe [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a11b78e ] LUCENE-9342: Collector's totalHitsThreshold should not be lower than numHits (#1448) Use the maximum of the two, this is so that relation is EQUAL_TO in the case of the number of hits in a query is less than the collector's numHits > Collector's totalHitsThreshold should not be lower than numHits > --- > > Key: LUCENE-9342 > URL: https://issues.apache.org/jira/browse/LUCENE-9342 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Tomas Eduardo Fernandez Lobbe >Priority: Minor > > While looking at SOLR-13289 I noticed this situation. If I create a collector > with {{numHits}} greater than {{totalHitsThreshold}}, and the number of hits > in the query is somewhere between those two numbers, the collector’s > {{totalHitRelation}} will be {{TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO}}, > however the count will be accurate in this case. While this doesn't violate > the current contract, the {{totalHitRelation}} could be changed to > {{TotalHits.Relation.EQUAL_TO}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
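The commit above takes the maximum of `numHits` and `totalHitsThreshold`, so that when a query matches fewer documents than `numHits` the count is exact and the relation can be `EQUAL_TO`. A self-contained sketch of that rule (names are illustrative; the real logic lives in Lucene's top-docs collectors):

```java
// Illustrative sketch of the LUCENE-9342 rule: count hits accurately up to
// max(numHits, totalHitsThreshold), so a query with fewer matches than
// numHits gets an exact count and relation EQUAL_TO.
class HitsThresholdExample {
    enum Relation { EQUAL_TO, GREATER_THAN_OR_EQUAL_TO }

    static int effectiveThreshold(int numHits, int totalHitsThreshold) {
        return Math.max(numHits, totalHitsThreshold);
    }

    // If the true hit count never exceeds the effective threshold, counting
    // finished before any early termination, so the count is exact.
    static Relation relation(int actualHits, int numHits, int totalHitsThreshold) {
        return actualHits <= effectiveThreshold(numHits, totalHitsThreshold)
                ? Relation.EQUAL_TO
                : Relation.GREATER_THAN_OR_EQUAL_TO;
    }
}
```

For example, with `numHits=10` and `totalHitsThreshold=3`, a query matching 7 documents now reports an exact count instead of a lower bound.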
[GitHub] [lucene-solr] madrob commented on issue #1450: SOLR-14429: Convert XXX.txt files to proper XXX.md
madrob commented on issue #1450: URL: https://github.com/apache/lucene-solr/pull/1450#issuecomment-618573464 Do we need to update anything in `rat-sources.gradle` or `check-source-patterns.groovy`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob opened a new pull request #1452: Move audit logging docs under AAA section
madrob opened a new pull request #1452: URL: https://github.com/apache/lucene-solr/pull/1452 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on issue #1451: LUCENE-9345: Separate MergeScheduler from IndexWriter
s1monw commented on issue #1451: URL: https://github.com/apache/lucene-solr/pull/1451#issuecomment-618546044 Thanks @jpountz - I was wondering if we should break this API in 8.6 already. it's very expert IMO. /cc @mikemccand This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw commented on a change in pull request #1451: LUCENE-9345: Separate MergeScheduler from IndexWriter
s1monw commented on a change in pull request #1451: URL: https://github.com/apache/lucene-solr/pull/1451#discussion_r413998721 ## File path: lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java ## @@ -516,18 +519,18 @@ public synchronized void merge(IndexWriter writer, MergeTrigger trigger) throws if (verbose()) { message("now merge"); - message(" index: " + writer.segString()); + message(" index(source): " + mergeSource.toString()); Review comment: I added it on purpose. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-12778) Support encrypted password for ZK cred/ACL providers
[ https://issues.apache.org/jira/browse/SOLR-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-12778: -- Attachment: SOLR-12778.patch Status: Open (was: Open) I'm attaching a patch that starts to flesh out support for a new "{{zkDigestEncryptFile}}" option used by both {{VMParamsAllAndReadonlyDigestZkACLProvider}} and {{VMParamsSingleSetCredentialsDigestZkCredentialsProvider}} to decrypt all the username/password options they read, if specified. The patch also includes a new {{public static String decodeAES(String base64CipherTxt, File encryptFile)}} method in {{CryptoKeys}} wrapping the existing {{decodeAES(String base64CipherTxt, String pwd)}} to simplify the common overhead code for plugins like this (but I did not refactor the existing File handling code from DIH because it has a lot of code smells I didn't want to propagate: assuming limits on the file size, calling {{new String(byte[])}}, etc...) Unfortunately this patch doesn't work at the moment because the {{CryptoKeys}} class is in solr-core and these plugins live in solr-solrj. I know there has been a lot of concern about the size & dependencies of solrj, so I'm not sure how people will/would feel about migrating CryptoKeys into solrj ... I think it can be done w/o increasing the ivy dependencies, but I have not yet attempted it. > Support encrypted password for ZK cred/ACL providers > > > Key: SOLR-12778 > URL: https://issues.apache.org/jira/browse/SOLR-12778 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Jan Høydahl >Priority: Major > Attachments: SOLR-12778.patch > > > The {{VMParamsSingleSetCredentialsDigestZkCredentialsProvider}} takes a > {{zkDigestPassword}} in as a plain-text JVM param, and the > {{VMParamsAllAndReadonlyDigestZkACLProvider}} takes both {{zkDigestPassword}} > and {{zkDigestReadonlyPassword}}. 
> Propose to give an option to encrypt these password using the same mechanism > as DIH does: > # Add a new VM param "zkDigestPasswordEncryptionKeyFile" which is a path to > a file holding the encryption key > # Store an encryption key in above mentioned file and restrict access to > this file so only Solr user can read it. > # Encrypt the ZK passwords using the encryption key and provide the > encrypted password in place of the plaintext one > We could also create a utility command that takes the magic out of encrypting > the pw: > {noformat} > bin/solr util encrypt [-keyfile ] {noformat} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
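The File-based `decodeAES` wrapper described in the comment above can be sketched as follows. Everything here is illustrative — `readKeyFile` and the delegation target are hypothetical stand-ins for the patch's actual `CryptoKeys` code — but it shows the shape that avoids the DIH code smells mentioned (no assumed file-size limit, explicit charset instead of `new String(byte[])`):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of a file-based decodeAES wrapper; the real method in
// the patch lives in CryptoKeys and delegates to decodeAES(String, String).
class KeyFileExample {
    // Read the whole key file with an explicit charset: no arbitrary size
    // cap, no default-charset new String(byte[]).
    static String readKeyFile(Path keyFile) throws IOException {
        return new String(Files.readAllBytes(keyFile), StandardCharsets.UTF_8).trim();
    }

    // Hypothetical stand-in for the existing decodeAES(String, String)
    // overload; a real implementation would perform the AES decryption.
    static String decodeAES(String base64CipherTxt, String pwd) {
        return "decoded(" + base64CipherTxt + "," + pwd + ")";
    }

    // The new wrapper: resolve the key from the file, then delegate.
    static String decodeAES(String base64CipherTxt, Path encryptFile) throws IOException {
        return decodeAES(base64CipherTxt, readKeyFile(encryptFile));
    }
}
```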
[jira] [Commented] (LUCENE-8811) Add maximum clause count check to IndexSearcher rather than BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090733#comment-17090733 ] Alan Woodward commented on LUCENE-8811: --- TermInSetQuery is designed to be a more efficient replacement for a boolean disjunction of terms, so having it trip the max clauses check would defeat the point of having it in the first place. > Add maximum clause count check to IndexSearcher rather than BooleanQuery > > > Key: LUCENE-8811 > URL: https://issues.apache.org/jira/browse/LUCENE-8811 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Assignee: Alan Woodward >Priority: Minor > Fix For: master (9.0) > > Attachments: LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch, > LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch > > > Currently we only check whether boolean queries have too many clauses. > However there are other ways that queries may have too many clauses, for > instance if you have boolean queries that have themselves inner boolean > queries. > Could we use the new Query visitor API to move this check from BooleanQuery > to IndexSearcher in order to make this check more consistent across queries? > See for instance LUCENE-8810 where a rewrite rule caused the maximum clause > count to be hit even though the total number of leaf queries remained the > same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1444: LUCENE-9338: Clean up type safety in SimpleBindings
romseygeek commented on a change in pull request #1444: URL: https://github.com/apache/lucene-solr/pull/1444#discussion_r413936078 ## File path: lucene/expressions/src/java/org/apache/lucene/expressions/SimpleBindings.java ## @@ -96,24 +90,51 @@ public DoubleValuesSource getDoubleValuesSource(String name) { case SCORE: return DoubleValuesSource.SCORES; default: -throw new UnsupportedOperationException(); +throw new UnsupportedOperationException(); } } - /** - * Traverses the graph of bindings, checking there are no cycles or missing references - * @throws IllegalArgumentException if the bindings is inconsistent + @Override + public DoubleValuesSource getDoubleValuesSource(String name) { +if (map.containsKey(name) == false) { Review comment: I'm pretty sure this won't be in the hot path - it's part of query setup, not query execution - and I think it reads more clearly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1444: LUCENE-9338: Clean up type safety in SimpleBindings
madrob commented on a change in pull request #1444: URL: https://github.com/apache/lucene-solr/pull/1444#discussion_r413933007 ## File path: lucene/expressions/src/java/org/apache/lucene/expressions/SimpleBindings.java ## @@ -96,24 +90,51 @@ public DoubleValuesSource getDoubleValuesSource(String name) { case SCORE: return DoubleValuesSource.SCORES; default: -throw new UnsupportedOperationException(); +throw new UnsupportedOperationException(); } } - /** - * Traverses the graph of bindings, checking there are no cycles or missing references - * @throws IllegalArgumentException if the bindings is inconsistent + @Override + public DoubleValuesSource getDoubleValuesSource(String name) { +if (map.containsKey(name) == false) { Review comment: Is `containsKey` boolean check more clear than `get` with a null check? I think the latter is going to be more efficient because it's only a single map operation, but I guess this way might be better for the JVM's escape analysis? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
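The two idioms debated in the review thread above differ only in the number of map lookups; both reject unknown names the same way. A neutral side-by-side on a generic map (names are illustrative, not the SimpleBindings code itself):

```java
import java.util.Map;

// Side-by-side of the two idioms from the review: containsKey + get
// (two lookups, arguably reads more clearly) vs. a single get with a
// null check (one lookup).
class LookupExample {
    static int lookupContainsKey(Map<String, Integer> map, String name) {
        if (map.containsKey(name) == false) {
            throw new IllegalArgumentException("Invalid reference '" + name + "'");
        }
        return map.get(name);
    }

    static int lookupGet(Map<String, Integer> map, String name) {
        Integer value = map.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Invalid reference '" + name + "'");
        }
        return value;
    }
}
```

One caveat worth noting: the single-`get` variant treats a key mapped to `null` the same as a missing key, so the two are only equivalent when the map never stores null values, which holds for bindings maps like this one.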
[jira] [Commented] (LUCENE-8811) Add maximum clause count check to IndexSearcher rather than BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090728#comment-17090728 ] Ruben Q L commented on LUCENE-8811: --- Friendly reminder to see if someone can answer [~zabetak]'s question in the previous comment. > Add maximum clause count check to IndexSearcher rather than BooleanQuery > > > Key: LUCENE-8811 > URL: https://issues.apache.org/jira/browse/LUCENE-8811 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Assignee: Alan Woodward >Priority: Minor > Fix For: master (9.0) > > Attachments: LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch, > LUCENE-8811.patch, LUCENE-8811.patch, LUCENE-8811.patch > > > Currently we only check whether boolean queries have too many clauses. > However there are other ways that queries may have too many clauses, for > instance if you have boolean queries that have themselves inner boolean > queries. > Could we use the new Query visitor API to move this check from BooleanQuery > to IndexSearcher in order to make this check more consistent across queries? > See for instance LUCENE-8810 where a rewrite rule caused the maximum clause > count to be hit even though the total number of leaf queries remained the > same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090721#comment-17090721 ] Mike Drob commented on SOLR-14428: -- If there are no terms in the index, then the fuzzy query should be collapsing pretty quickly and there would be no reason for it to take up so much memory. Do we only do that at query processing time now? I thought we would be doing that as aggressively as possible. > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public (Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Assignee: Andrzej Bialecki >Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. 
> Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! > <1mb > With an empty cache, running this query > _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory > allocation > {noformat} > 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 > 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:648855 > {noformat} > ~1 gives 98253 and ~0 gives 6339 on 8.5.1. 8.3.1 is constant at 1520 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14430) Authorization plugins should check roles from request
Mike Drob created SOLR-14430: Summary: Authorization plugins should check roles from request Key: SOLR-14430 URL: https://issues.apache.org/jira/browse/SOLR-14430 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: security Reporter: Mike Drob The AuthorizationContext exposes {{getUserPrincipal}} to the plugin, but it does not allow the plugin to interrogate the request for {{isUserInRole}}. If we trust the request enough to get a principal from it, then we should trust it enough to ask about roles, as those could have been defined and verified by an authentication plugin. This model would be an alternative to the current model where RuleBasedAuthorizationPlugin maintains its own user->role mapping. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
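The proposal in SOLR-14430 above is for authorization to consult roles already verified on the request (in the spirit of the servlet API's `HttpServletRequest.isUserInRole`) rather than a plugin-maintained user-to-role map. A minimal sketch of that shape — the interface and method names below are hypothetical stand-ins, not Solr's actual `AuthorizationContext` API:

```java
import java.util.Set;

// Illustrative sketch of SOLR-14430's idea: the authorization plugin asks
// the request (whose roles an authentication plugin may have established)
// instead of consulting a plugin-private user->role mapping. These names
// are hypothetical, not Solr's real API.
class RoleCheckExample {
    interface AuthorizationContext {
        String getUserPrincipal();
        boolean isUserInRole(String role); // the proposed addition
    }

    // Authorized if the request vouches for any one of the required roles.
    static boolean isAuthorized(AuthorizationContext ctx, Set<String> requiredRoles) {
        return requiredRoles.stream().anyMatch(ctx::isUserInRole);
    }
}
```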
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1444: LUCENE-9338: Clean up type safety in SimpleBindings
romseygeek commented on a change in pull request #1444: URL: https://github.com/apache/lucene-solr/pull/1444#discussion_r413911753 ## File path: lucene/expressions/src/java/org/apache/lucene/expressions/SimpleBindings.java ## @@ -96,24 +90,51 @@ public DoubleValuesSource getDoubleValuesSource(String name) { case SCORE: return DoubleValuesSource.SCORES; default: -throw new UnsupportedOperationException(); +throw new UnsupportedOperationException(); } } - /** - * Traverses the graph of bindings, checking there are no cycles or missing references - * @throws IllegalArgumentException if the bindings is inconsistent + @Override + public DoubleValuesSource getDoubleValuesSource(String name) { +if (map.containsKey(name) == false) { + throw new IllegalArgumentException("Invalid reference '" + name + "'"); +} +return map.get(name).apply(this); + } + + /** + * Traverses the graph of bindings, checking there are no cycles or missing references + * @throws IllegalArgumentException if the bindings is inconsistent */ public void validate() { Review comment: Caching is a whole other conversation, which I think is related to the stuff that @mkhludnev is working on around grouping (in that we could plausibly have multiple references to the same iterator all moving in lockstep, where at the moment we pull separate iterators for each reference). But I think that's for a follow-up really, this issue is just a bit of refactoring to improve type safety. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on a change in pull request #1444: LUCENE-9338: Clean up type safety in SimpleBindings
msokolov commented on a change in pull request #1444: URL: https://github.com/apache/lucene-solr/pull/1444#discussion_r413907462 ## File path: lucene/expressions/src/java/org/apache/lucene/expressions/SimpleBindings.java ## @@ -96,24 +90,51 @@ public DoubleValuesSource getDoubleValuesSource(String name) { case SCORE: return DoubleValuesSource.SCORES; default: -throw new UnsupportedOperationException(); +throw new UnsupportedOperationException(); } } - /** - * Traverses the graph of bindings, checking there are no cycles or missing references - * @throws IllegalArgumentException if the bindings is inconsistent + @Override + public DoubleValuesSource getDoubleValuesSource(String name) { +if (map.containsKey(name) == false) { + throw new IllegalArgumentException("Invalid reference '" + name + "'"); +} +return map.get(name).apply(this); + } + + /** + * Traverses the graph of bindings, checking there are no cycles or missing references + * @throws IllegalArgumentException if the bindings is inconsistent */ public void validate() { Review comment: Have you considered returning the map, or an immutable view on it, so that callers can use this to enumerate all the dependencies? In a similar framework, I've found this to be pretty helpful for analyzing query patterns. It's also nice to know if the same name occurs multiple times in the dependency tree; maybe one should cache its value in that case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mhitza edited a comment on issue #1435: SOLR-14410: Switch from SysV init script to systemd service file
mhitza edited a comment on issue #1435: URL: https://github.com/apache/lucene-solr/pull/1435#issuecomment-618450549 @janhoy just a quick ping, the PR is ready for a new review edit: just saw the Re-request review button after posting my comment, oh well :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mhitza commented on issue #1435: SOLR-14410: Switch from SysV init script to systemd service file
mhitza commented on issue #1435: URL: https://github.com/apache/lucene-solr/pull/1435#issuecomment-618450549 @janhoy just a quick ping, the PR is ready for a new review This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14413) allow timeAllowed and cursorMark parameters
[ https://issues.apache.org/jira/browse/SOLR-14413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090672#comment-17090672 ] Mike Drob commented on SOLR-14413: -- That's awesome, thanks for verifying! Definitely update the docs with that info, please! The one concern I have at this point is about the return of zero results. Typically returning the same cursor indicates that we've reached the end of the results. Is there a way to distinguish the real end of the results from the case where we do not get any results in the time allowed? I know that we have the {{partialResults}} header there, but could there be a case where the opposite is true? We return partialResults:true, but there actually are no more results? Again, probably documenting the permutations here is sufficient. Also, can we add tests that explicitly demonstrate a partial cursor mark working? > allow timeAllowed and cursorMark parameters > --- > > Key: SOLR-14413 > URL: https://issues.apache.org/jira/browse/SOLR-14413 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) > Components: search >Reporter: John Gallagher >Priority: Minor > Attachments: SOLR-14413.patch, timeallowed_cursormarks_results.txt > > Time Spent: 10m > Remaining Estimate: 0h > > Ever since cursorMarks were introduced in SOLR-5463 in 2014, cursorMark and > timeAllowed parameters were not allowed in combination ("Can not search using > both cursorMark and timeAllowed") > , from [QueryComponent.java|#L359]]: > > {code:java} > > if (null != rb.getCursorMark() && 0 < timeAllowed) { > // fundamentally incompatible > throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Can not > search using both " + CursorMarkParams.CURSOR_MARK_PARAM + " and " + > CommonParams.TIME_ALLOWED); > } {code} > While theoretically impure to use them in combination, it is often desirable > to support cursormarks-style deep paging and attempt to protect Solr nodes > from runaway queries using timeAllowed, in the hopes that most of the time, > the query completes in the allotted time, and there is no conflict. > > However if the query takes too long, it may be preferable to end the query > and protect the Solr node and provide the user with a somewhat inaccurate > sorted list. As noted in SOLR-6930, SOLR-5986 and others, timeAllowed is > frequently used to prevent runaway load. In fact, cursorMark and > shards.tolerant are allowed in combination, so any argument in favor of > purity would be a bit muddied in my opinion. > > This was discussed once in the mailing list that I can find: > [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201506.mbox/%3c5591740b.4080...@elyograg.org%3E] > It did not look like there was strong support for preventing the combination. > > I have tested cursorMark and timeAllowed combination together, and even when > partial results are returned because the timeAllowed is exceeded, the > cursorMark response value is still valid and reasonable. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
dsmiley commented on issue #1351: URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-618441539 Okay; I get your point about noise. I think it's also true that smaller (realistic) data sets may expose how well code can scale down and make different / better choices than it should make for larger sizes; and that's not noise. So both large and small data sets matter for benchmarking. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14429) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/SOLR-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090663#comment-17090663 ] Tomoko Uchida commented on SOLR-14429: -- The branch passed {{nightly-smoke}}. {code} [smoker] SUCCESS! [0:43:25.703442] {code} > Convert XXX.txt files to proper XXX.md > -- > > Key: SOLR-14429 > URL: https://issues.apache.org/jira/browse/SOLR-14429 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > "README.txt" files are (partially) written in markdown and can be converted > to proper markdown files. This change was suggested on LUCENE-9321. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] romseygeek commented on issue #1444: LUCENE-9338: Clean up type safety in SimpleBindings
romseygeek commented on issue #1444: URL: https://github.com/apache/lucene-solr/pull/1444#issuecomment-618403151 Tricksy, tricksy... I've updated the cycle detection logic to handle multiple levels of recursion, and as a bonus we get a nicer error message as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
jpountz commented on issue #1351: URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-618405559 > Should we infer that you don't think a 1M doc corpus is realistic in many production settings of Lucene? It's certainly realistic, but I think that the point still holds that these collections are not very useful for benchmarking as they tend to be more noisy and can easily miss improvements as even a linear scan is fast on a small collection? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9344) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/LUCENE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090629#comment-17090629 ] Tomoko Uchida commented on LUCENE-9344: --- The branch passed {{nightly-smoke}}. {code:java} [smoker] SUCCESS! [0:44:59.564581] {code}
[GitHub] [lucene-solr] jpountz commented on issue #1444: LUCENE-9338: Clean up type safety in SimpleBindings
jpountz commented on issue #1444: URL: https://github.com/apache/lucene-solr/pull/1444#issuecomment-618396995 I managed to defeat the new validation logic with this test: ``` public void testCoRecursion42() throws Exception { SimpleBindings bindings = new SimpleBindings(); bindings.add("cycle2", JavascriptCompiler.compile("cycle1")); bindings.add("cycle1", JavascriptCompiler.compile("cycle0")); bindings.add("cycle0", JavascriptCompiler.compile("cycle1")); IllegalArgumentException expected = expectThrows(IllegalArgumentException.class, () -> { bindings.validate(); }); assertTrue(expected.getMessage().contains("Cycle detected")); } ``` It depends on HashMap iteration order, so it might not reproduce for you, but the issue is that `cycle2` gets validated first. And as you recursively create expressions for bindings for `cycle2`, there is an infinite recursive loop, but it only includes `cycle0` and `cycle1` so we might need to track the names of the expressions in a set as we recursively resolve bindings to catch such cases too?
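The set-based tracking proposed in the comment above can be sketched in isolation. The `CycleCheck` class, the map-of-names shape, and the method names below are hypothetical stand-ins for `SimpleBindings`, not the actual PR code; the point is only the technique: carry the names on the current resolution path in a set, so a co-recursive cycle is caught even when the entry point (`cycle2` in the test) is not itself part of the cycle.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CycleCheck {
    // Hypothetical stand-in for SimpleBindings: each binding name maps to the
    // names of the other bindings its expression references.
    public static void validate(Map<String, List<String>> bindings) {
        for (String name : bindings.keySet()) {
            resolve(name, bindings, new LinkedHashSet<>());
        }
    }

    // Track the names on the current resolution path; if we re-enter a name
    // already on the path, we found a cycle, even if the starting binding is
    // not part of it.
    private static void resolve(String name, Map<String, List<String>> bindings,
                                Set<String> path) {
        if (path.add(name) == false) {
            throw new IllegalArgumentException("Cycle detected: " + path);
        }
        for (String dep : bindings.getOrDefault(name, Collections.emptyList())) {
            resolve(dep, bindings, path);
        }
        path.remove(name);
    }

    public static void main(String[] args) {
        // Same shape as the testCoRecursion42 case quoted above.
        Map<String, List<String>> bindings = new HashMap<>();
        bindings.put("cycle2", List.of("cycle1"));
        bindings.put("cycle1", List.of("cycle0"));
        bindings.put("cycle0", List.of("cycle1"));
        try {
            validate(bindings);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // a "Cycle detected: ..." message
        }
    }
}
```

Note that `path.remove(name)` on the way out is what keeps diamond-shaped (non-cyclic) dependency graphs valid while still rejecting true cycles.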
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090609#comment-17090609 ] Colvin Cowie commented on SOLR-14428: - Thanks, we'll just stick on 8.3.1 for the time being. Though I will look at moving to CaffeineCache in general since I see the other caches are being removed anyway. Cheers
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1451: LUCENE-9345: Separate MergeSchedulder from IndexWriter
jpountz commented on a change in pull request #1451: URL: https://github.com/apache/lucene-solr/pull/1451#discussion_r413791931 ## File path: lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java ## @@ -516,18 +519,18 @@ public synchronized void merge(IndexWriter writer, MergeTrigger trigger) throws if (verbose()) { message("now merge"); - message(" index: " + writer.segString()); + message(" index(source): " + mergeSource.toString()); Review comment: did you mean to add `(source)` after `index` or is it a side-effect of a search/replace?
[jira] [Comment Edited] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090564#comment-17090564 ] Andrzej Bialecki edited comment on SOLR-14428 at 4/23/20, 12:22 PM: [~cjcowie] as a temporary workaround you can switch to using {{CaffeineCache}} and see if it behaves differently. Also, configuring the queryResultCache using {{maxRamMB}} instead of {{maxSize}} should cap the max RAM usage. Of course, these are just stopgaps, they don't address the underlying issue. [~romseygeek] Solr doesn't yet support any cache admission policy / gating, unfortunately. That would be a nice improvement. There's a robust admission policy in CaffeineCache but it serves a slightly different purpose - it considers usage patterns when deciding what items to evict first. It would be nice to also be able to automatically avoid caching objects that are eg. too large, or cheap to compute and large. I'll look into this in a few days (need to wrap up other stuff). was (Author: ab): [~cjcowie] as a temporary workaround you can switch to using {{CaffeineCache}} and see if it behaves differently. Also, configuring the queryResultCache using {{maxRamMB}} instead of {{maxSize}} should cap the max RAM usage. Of course, these are just stopgaps, they don't address the underlying issue. I'll look into this in a few days (need to wrap up other stuff).
[jira] [Assigned] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki reassigned SOLR-14428: --- Assignee: Andrzej Bialecki
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090564#comment-17090564 ] Andrzej Bialecki commented on SOLR-14428: - [~cjcowie] as a temporary workaround you can switch to using {{CaffeineCache}} and see if it behaves differently. Also, configuring the queryResultCache using {{maxRamMB}} instead of {{maxSize}} should cap the max RAM usage. Of course, these are just stopgaps, they don't address the underlying issue. I'll look into this in a few days (need to wrap up other stuff).
[jira] [Commented] (SOLR-12690) Regularize LoggerFactory declarations
[ https://issues.apache.org/jira/browse/SOLR-12690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090546#comment-17090546 ] Erick Erickson commented on SOLR-12690: --- Fixed, thanks for catching! > Regularize LoggerFactory declarations > - > > Key: SOLR-12690 > URL: https://issues.apache.org/jira/browse/SOLR-12690 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Minor > Fix For: 7.5, 8.0 > > Attachments: SOLR-12690.patch, SOLR-12690.patch > > > LoggerFactory declarations have several different forms, they should all be: > private static final Logger log = > LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); > * lowercase log > * private static > * non hard-coded class lookup. > I'm going to regularize all of these, I think there are about 80 currently, > we've been nibbling away at this but I'll try to do it in one go. > [~cpoerschke] I think there's another Jira about this that I can't find just > now, ring any bells? > Once that's done, is there a good way to make violations of this fail > precommit?
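The regularized declaration in SOLR-12690 hinges on `MethodHandles.lookup().lookupClass()`. The small self-contained demo below (slf4j is not on the classpath here, so the actual `Logger` line appears only as a comment; the class name `LoggerNameDemo` is invented for illustration) shows why that lookup makes the declaration safe to copy between classes:

```java
import java.lang.invoke.MethodHandles;

public class LoggerNameDemo {
    // The standardized form from SOLR-12690 (needs slf4j, shown as a comment):
    //   private static final Logger log =
    //       LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
    // MethodHandles.lookup().lookupClass() resolves the enclosing class at
    // runtime, so a copy/pasted declaration never carries over another
    // class's name the way a hard-coded getLogger(SomeOtherClass.class) can.
    public static Class<?> enclosing() {
        return MethodHandles.lookup().lookupClass();
    }

    public static void main(String[] args) {
        System.out.println(enclosing().getSimpleName()); // prints: LoggerNameDemo
    }
}
```

This is also why the pattern lends itself to mechanical precommit checking: the declaration text is identical in every class.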
[jira] [Commented] (SOLR-12690) Regularize LoggerFactory declarations
[ https://issues.apache.org/jira/browse/SOLR-12690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090545#comment-17090545 ] ASF subversion and git services commented on SOLR-12690: Commit eb8d3d3a0f2e039a64e74b296eb64da2ae530800 in lucene-solr's branch refs/heads/branch_8x from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=eb8d3d3 ] SOLR-12690: Regularize LoggerFactory declarations. Fixing an incorrect change
[jira] [Commented] (SOLR-12690) Regularize LoggerFactory declarations
[ https://issues.apache.org/jira/browse/SOLR-12690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090544#comment-17090544 ] ASF subversion and git services commented on SOLR-12690: Commit 4eb755db18f6a605bf62e1c8f029093ad8d6ca7b in lucene-solr's branch refs/heads/master from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4eb755d ] SOLR-12690: Regularize LoggerFactory declarations. Fixing an incorrect change
[GitHub] [lucene-solr] jpountz commented on issue #1440: LUCENE-9330: Make SortFields responsible for index sorting and serialization
jpountz commented on issue #1440: URL: https://github.com/apache/lucene-solr/pull/1440#issuecomment-618356399 > This is trickier for the segment info format because both reading and writing are handled by the same class. I think so far we've only done this for Postings formats, and not for other parts of the Codec? We've done it in the past already, see e.g. https://github.com/apache/lucene-solr/commit/23b002a0fdf2f6025f1eb026c0afca247fb21ed0. LuceneXXSegmentInfoFormat is changed to throw an UOE in the `write` method, then a LuceneXXRWSegmentInfoFormat is created that extends LuceneXXSegmentInfoFormat and adds back the `write` implementation.
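The read-only/RW split described above can be sketched in miniature. The class names below echo the LuceneXX naming but are illustrative stand-ins, not the actual codec classes: the shipped back-compat format keeps `read` working and throws `UnsupportedOperationException` from `write`, while a test-only subclass restores the write path.

```java
public class RwFormatDemo {
    // Shipped back-compat format: reading old segments still works, but the
    // write method throws UOE so new indices can't be created in the old format.
    public static class OldSegmentInfoFormat {
        public String read() { return "segment-info"; }
        public void write(String info) {
            throw new UnsupportedOperationException("Old formats can't be used for writing");
        }
    }

    // Test-only subclass (the "LuceneXXRWSegmentInfoFormat" role): extends the
    // read-only format and adds the write implementation back, so backward
    // compatibility tests can still produce old-format data.
    public static class RwOldSegmentInfoFormat extends OldSegmentInfoFormat {
        public String written;
        @Override
        public void write(String info) { written = info; }
    }

    public static void main(String[] args) {
        try {
            new OldSegmentInfoFormat().write("si");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only format: " + e.getMessage());
        }
        RwOldSegmentInfoFormat rw = new RwOldSegmentInfoFormat();
        rw.write("si");
        System.out.println("RW subclass wrote: " + rw.written);
    }
}
```

The design keeps the production jar strictly read-only for old formats while the RW subclass lives only in test code.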
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1440: LUCENE-9330: Make SortFields responsible for index sorting and serialization
jpountz commented on a change in pull request #1440: URL: https://github.com/apache/lucene-solr/pull/1440#discussion_r413746031 ## File path: lucene/core/src/java/org/apache/lucene/index/DefaultIndexingChain.java ## @@ -527,45 +589,63 @@ private void indexPoint(PerField fp, IndexableField field) throws IOException { fp.pointValuesWriter.addPackedValue(docState.docID, field.binaryValue()); } - private void validateIndexSortDVType(Sort indexSort, String fieldName, DocValuesType dvType) { + private void validateIndexSortDVType(Sort indexSort, String fieldToValidate, DocValuesType dvType) throws IOException { for (SortField sortField : indexSort.getSort()) { - if (sortField.getField().equals(fieldName)) { -switch (dvType) { - case NUMERIC: -if (sortField.getType().equals(SortField.Type.INT) == false && - sortField.getType().equals(SortField.Type.LONG) == false && - sortField.getType().equals(SortField.Type.FLOAT) == false && - sortField.getType().equals(SortField.Type.DOUBLE) == false) { - throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField); -} -break; + IndexSorter sorter = sortField.getIndexSorter(); + if (sorter == null) { +throw new IllegalStateException("Cannot sort index with sort order " + sortField); + } + sorter.getDocComparator(new DocValuesLeafReader() { +@Override +public NumericDocValues getNumericDocValues(String field) { + if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.NUMERIC) { +throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be NUMERIC but it is [" + dvType + "]"); + } + return DocValues.emptyNumeric(); +} - case BINARY: -throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField); +@Override +public BinaryDocValues getBinaryDocValues(String field) { + if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.BINARY) { +throw new IllegalArgumentException("SortField " + 
sortField + " expected field [" + field + "] to be BINARY but it is [" + dvType + "]"); + } + return DocValues.emptyBinary(); +} - case SORTED: -if (sortField.getType().equals(SortField.Type.STRING) == false) { - throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField); -} -break; +@Override +public SortedDocValues getSortedDocValues(String field) { + if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED) { +throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be SORTED but it is [" + dvType + "]"); + } + return DocValues.emptySorted(); +} - case SORTED_NUMERIC: -if (sortField instanceof SortedNumericSortField == false) { - throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField); -} -break; +@Override +public SortedNumericDocValues getSortedNumericDocValues(String field) { + if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED_NUMERIC) { +throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be SORTED_NUMERIC but it is [" + dvType + "]"); + } + return DocValues.emptySortedNumeric(0); +} - case SORTED_SET: -if (sortField instanceof SortedSetSortField == false) { - throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField); -} -break; +@Override +public SortedSetDocValues getSortedSetDocValues(String field) { + if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED_SET) { +throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be SORTED_SET but it is [" + dvType + "]"); + } + return DocValues.emptySortedSet(); +} - default: -throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField); +@Override +public FieldInfos getFieldInfos() { + throw new UnsupportedOperationException(); } -break; - } + 
+@Override +public int maxDoc() { + return 0; Review comment: +1
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090534#comment-17090534 ] Alan Woodward commented on SOLR-14428: -- Hi [~cjcowie], thanks for opening this and for the thorough investigation! I'm afraid I don't really know much about how the Solr query caches choose which queries to cache - [~ab] might have a better idea of how to fix this?
[jira] [Commented] (SOLR-14429) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/SOLR-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090523#comment-17090523 ] Tomoko Uchida commented on SOLR-14429: -- bq. Is Lucene excluded here? Yes. I have modified only files under {{solr/}} folder here, to create a CHANGES entry for Solr. For Lucene LUCENE-9344 (and a PR) has been opened.
[jira] [Updated] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-14428: Description: I sent this to the mailing list I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors while running our normal tests. After profiling it was clear that the majority of the heap was allocated through FuzzyQuery. LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the FuzzyQuery's constructor. I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries from random UUID strings for 5 minutes {code} FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" {code} When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while the memory usage has increased drastically on 8.5.0 and 8.5.1. Comparison of heap usage while running the attached test against Solr 8.3.1 and 8.5.1 with a single (empty) shard and 4GB heap: !image-2020-04-23-09-18-06-070.png! And with 4 shards on 8.4.1 and 8.5.0: !screenshot-2.png! I'm guessing that the memory might be being leaked if the FuzzyQuery objects are referenced from the cache, while the FuzzyTermsEnum would not have been. Query Result Cache on 8.5.1: !screenshot-3.png! ~316mb in the cache QRC on 8.3.1 !screenshot-4.png! <1mb With an empty cache, running this query _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory allocation {noformat} 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:648855 {noformat} ~1 gives 98253 and ~0 gives 6339 on 8.5.1. 8.3.1 is constant at 1520 was: I sent this to the mailing list I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors while running our normal tests. After profiling it was clear that the majority of the heap was allocated through FuzzyQuery. LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the FuzzyQuery's constructor. 
I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries from random UUID strings for 5 minutes {code} FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" {code} When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while the memory usage has increased drastically on 8.5.0 and 8.5.1. Comparison of heap usage while running the attached test against Solr 8.3.1 and 8.5.1 with a single (empty) shard and 4GB heap: !image-2020-04-23-09-18-06-070.png! And with 4 shards on 8.4.1 and 8.5.0: !screenshot-2.png! I'm guessing that the memory might be being leaked if the FuzzyQuery objects are referenced from the cache, while the FuzzyTermsEnum would not have been. Query Result Cache on 8.5.1: !screenshot-3.png! ~316mb in the cache QRC on 8.3.1 !screenshot-4.png! <1mb With an empty cache, running this query _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory allocation {noformat} 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:648855 {noformat} > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. 
> I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! > <1mb
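The attached FuzzyHammer.java is not reproduced in this thread, but the query-generation step it describes can be sketched as follows (the class and method names here are illustrative, not taken from the attachment):

```java
import java.util.UUID;

/** Minimal sketch of the query generator described in the issue: a fuzzy
 *  query over a random 32-character hex term with maximum edit distance 2. */
class FuzzyQueryStrings {

    /** Builds a query string like "field_s:e41848af85d24ac197c71db6888e17bc~2". */
    static String randomFuzzyQuery(String fieldName, int maxEdits) {
        // A random UUID, with the dashes stripped, yields 32 hex characters.
        String term = UUID.randomUUID().toString().replace("-", "");
        return fieldName + ":" + term + "~" + maxEdits;
    }

    public static void main(String[] args) {
        // Firing many of these in a loop against a Solr instance is what
        // drives the FuzzyQuery automata allocation discussed above.
        System.out.println(randomFuzzyQuery("field_s", 2));
    }
}
```

Each such query builds its Levenshtein automata eagerly in the FuzzyQuery constructor after LUCENE-9068, which is why a stream of unique terms inflates the query result cache.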
[jira] [Commented] (SOLR-12845) Add a default cluster policy
[ https://issues.apache.org/jira/browse/SOLR-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090488#comment-17090488 ] ASF subversion and git services commented on SOLR-12845: Commit 789c97be5fb66b61210cff9dafb89daabec9fe39 in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=789c97b ] SOLR-12845: Properly clear default policy between tests. > Add a default cluster policy > > > Key: SOLR-12845 > URL: https://issues.apache.org/jira/browse/SOLR-12845 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Reporter: Shalin Shekhar Mangar >Assignee: Andrzej Bialecki >Priority: Major > Fix For: 8.6 > > Attachments: SOLR-12845.patch, SOLR-12845.patch > > > [~varunthacker] commented on SOLR-12739: > bq. We should also ship with some default policies - "Don't allow more than > one replica of a shard on the same JVM" , "Distribute cores across the > cluster evenly" , "Distribute replicas per collection across the nodes" > This issue is about adding these defaults. I propose the following as default > cluster policy: > {code} > # Each shard cannot have more than one replica on the same node if possible > {"replica": "<2", "shard": "#EACH", "node": "#ANY", "strict":false} > # Each collections replicas should be equally distributed amongst nodes > {"replica": "#EQUAL", "node": "#ANY", "strict":false} > # All cores should be equally distributed amongst nodes > {"cores": "#EQUAL", "node": "#ANY", "strict":false} > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
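As a rough sketch (not part of the patch), the proposed default rules could be wrapped in the payload shape used by Solr's autoscaling API, assuming the {{set-cluster-policy}} command; the rule strings themselves are copied from the proposal above:

```java
/** Builds a set-cluster-policy payload from the three proposed default rules.
 *  The surrounding {"set-cluster-policy": [...]} shape is assumed from
 *  Solr's autoscaling API; the rules are those quoted in the issue. */
class DefaultClusterPolicy {

    static String setClusterPolicyPayload() {
        String rules = String.join(",\n  ",
            // Each shard should not have more than one replica per node
            "{\"replica\": \"<2\", \"shard\": \"#EACH\", \"node\": \"#ANY\", \"strict\": false}",
            // Each collection's replicas should be distributed equally across nodes
            "{\"replica\": \"#EQUAL\", \"node\": \"#ANY\", \"strict\": false}",
            // All cores should be distributed equally across nodes
            "{\"cores\": \"#EQUAL\", \"node\": \"#ANY\", \"strict\": false}");
        return "{\"set-cluster-policy\": [\n  " + rules + "\n]}";
    }

    public static void main(String[] args) {
        // This JSON would be POSTed to the autoscaling endpoint.
        System.out.println(setClusterPolicyPayload());
    }
}
```

Note that all three rules carry {{"strict": false}}, so they act as preferences rather than hard constraints when the cluster cannot satisfy them.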
[jira] [Commented] (SOLR-14429) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/SOLR-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090489#comment-17090489 ] Uwe Schindler commented on SOLR-14429: -- Is Lucene excluded here? I was not aware that the site docs were already fixed to have mdtext as extension. I have no preference on it, but md seems more common than mdtext. > Convert XXX.txt files to proper XXX.md > -- > > Key: SOLR-14429 > URL: https://issues.apache.org/jira/browse/SOLR-14429 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > "README.txt" files are (partially) written in markdown and can be converted > to proper markdown files. This change was suggested on LUCENE-9321. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12845) Add a default cluster policy
[ https://issues.apache.org/jira/browse/SOLR-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090486#comment-17090486 ] ASF subversion and git services commented on SOLR-12845: Commit 2a7ba5a48e065a5bb064a9c62562e73a0c3fb62e in lucene-solr's branch refs/heads/master from Andrzej Bialecki [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2a7ba5a ] SOLR-12845: Properly clear default policy between tests. > Add a default cluster policy > > > Key: SOLR-12845 > URL: https://issues.apache.org/jira/browse/SOLR-12845 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Reporter: Shalin Shekhar Mangar >Assignee: Andrzej Bialecki >Priority: Major > Fix For: 8.6 > > Attachments: SOLR-12845.patch, SOLR-12845.patch > > > [~varunthacker] commented on SOLR-12739: > bq. We should also ship with some default policies - "Don't allow more than > one replica of a shard on the same JVM" , "Distribute cores across the > cluster evenly" , "Distribute replicas per collection across the nodes" > This issue is about adding these defaults. I propose the following as default > cluster policy: > {code} > # Each shard cannot have more than one replica on the same node if possible > {"replica": "<2", "shard": "#EACH", "node": "#ANY", "strict":false} > # Each collections replicas should be equally distributed amongst nodes > {"replica": "#EQUAL", "node": "#ANY", "strict":false} > # All cores should be equally distributed amongst nodes > {"cores": "#EQUAL", "node": "#ANY", "strict":false} > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] romseygeek commented on issue #1440: LUCENE-9330: Make SortFields responsible for index sorting and serialization
romseygeek commented on issue #1440: URL: https://github.com/apache/lucene-solr/pull/1440#issuecomment-618322909 > Should we remove write support from Lucene70SegmentInfoFormat and have a RW version under test-framework like we do for other components, so that users can't use it in their codecs but we can still run the segment info format test case? This is trickier for the segment info format because both reading and writing are handled by the same class. I think so far we've only done this for Postings formats, and not for other parts of the Codec? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
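The read-only vs read-write split being discussed can be sketched like this (illustrative names, not Lucene's actual classes): the shipped format class only reads the old format, while an RW subclass under test-framework restores write support for the format's test case:

```java
/** Sketch of a legacy format whose shipped class is read-only. */
class LegacySegmentInfoFormatSketch {
    String read(String input) {
        // Reading old segments must keep working for index back-compat.
        return "seginfo:" + input;
    }
    String write(String info) {
        // Users must not write the old format from their codecs.
        throw new UnsupportedOperationException("old format is read-only");
    }
}

/** Would live under test-framework: re-enables writing so the format
 *  test case can still round-trip data through the legacy format. */
class RWSegmentInfoFormatSketch extends LegacySegmentInfoFormatSketch {
    @Override
    String write(String info) {
        return "written:" + info;
    }
}
```

The wrinkle raised in the comment is visible here: because read and write live in one class, the split has to be done by subclassing rather than by simply dropping the writer, which is why so far only postings formats have received this treatment.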
[jira] [Commented] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090482#comment-17090482 ] Colvin Cowie commented on SOLR-14428: - Hi [~romseygeek], what are your thoughts on this? Thanks > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! 
> <1mb > With an empty cache, running this query > _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory > allocation > {noformat} > 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed: 1520 > 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:648855 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw opened a new pull request #1451: LUCENE-9345: Separate MergeScheduler from IndexWriter
s1monw opened a new pull request #1451: URL: https://github.com/apache/lucene-solr/pull/1451 This change extracts the methods that are used by MergeScheduler into a MergeSource interface. This allows IndexWriter to better ensure locking, hide internal methods and removes the tight coupling between the two complex classes. This will also improve future testing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9345) Separate IndexWriter from MergeScheduler
Simon Willnauer created LUCENE-9345: --- Summary: Separate IndexWriter from MergeScheduler Key: LUCENE-9345 URL: https://issues.apache.org/jira/browse/LUCENE-9345 Project: Lucene - Core Issue Type: Improvement Affects Versions: master (9.0) Reporter: Simon Willnauer MergeScheduler is tightly coupled with IndexWriter which causes IW to expose unnecessary methods. For instance only the scheduler should call IW#getNextMerge() but it's a public method. With some refactorings we can nicely separate the two. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
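The refactoring described here is an interface extraction: the scheduler is given a narrow MergeSource view instead of the whole IndexWriter. A minimal sketch of the pattern (names modeled on the issue, not Lucene's actual API):

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Narrow view of the writer that a merge scheduler actually needs;
 *  stands in for the MergeSource extracted from IndexWriter. */
interface MergeSourceSketch {
    /** Returns the next pending merge, or null if none remain. */
    String getNextMerge();
    /** Runs one merge. */
    void merge(String plan);
}

/** The scheduler depends only on the interface, so it can no longer reach
 *  IndexWriter internals, and tests can hand it a stub. */
class SchedulerSketch {
    void mergeAll(MergeSourceSketch source) {
        String plan;
        while ((plan = source.getNextMerge()) != null) {
            source.merge(plan);
        }
    }
}

/** Test double: what "improves future testing" in the issue refers to. */
class StubWriter implements MergeSourceSketch {
    final Queue<String> pending = new ArrayDeque<>();
    final Queue<String> done = new ArrayDeque<>();
    public String getNextMerge() { return pending.poll(); }
    public void merge(String plan) { done.add(plan); }
}
```

With this shape, a method like getNextMerge() no longer needs to be public on IndexWriter itself; only the MergeSource handed to the scheduler exposes it.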
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1440: LUCENE-9330: Make SortFields responsible for index sorting and serialization
romseygeek commented on a change in pull request #1440: URL: https://github.com/apache/lucene-solr/pull/1440#discussion_r413692946 ## File path: lucene/core/src/java/org/apache/lucene/index/SortedDocValuesWriter.java ## @@ -79,11 +78,6 @@ public void addValue(int docID, BytesRef value) { lastDocID = docID; } - @Override - public void finish(int maxDoc) { -updateBytesUsed(); - } Review comment: It was always called either immediately before `flush` or `getDocComparator`, so it seemed to make sense to just fold it directly into those methods. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1440: LUCENE-9330: Make SortFields responsible for index sorting and serialization
romseygeek commented on a change in pull request #1440: URL: https://github.com/apache/lucene-solr/pull/1440#discussion_r413689133

## File path: lucene/core/src/java/org/apache/lucene/index/DefaultIndexingChain.java

##
@@ -527,45 +589,63 @@ private void indexPoint(PerField fp, IndexableField field) throws IOException {
     fp.pointValuesWriter.addPackedValue(docState.docID, field.binaryValue());
   }
 
-  private void validateIndexSortDVType(Sort indexSort, String fieldName, DocValuesType dvType) {
+  private void validateIndexSortDVType(Sort indexSort, String fieldToValidate, DocValuesType dvType) throws IOException {
     for (SortField sortField : indexSort.getSort()) {
-      if (sortField.getField().equals(fieldName)) {
-        switch (dvType) {
-          case NUMERIC:
-            if (sortField.getType().equals(SortField.Type.INT) == false &&
-                sortField.getType().equals(SortField.Type.LONG) == false &&
-                sortField.getType().equals(SortField.Type.FLOAT) == false &&
-                sortField.getType().equals(SortField.Type.DOUBLE) == false) {
-              throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
-            }
-            break;
+      IndexSorter sorter = sortField.getIndexSorter();
+      if (sorter == null) {
+        throw new IllegalStateException("Cannot sort index with sort order " + sortField);
+      }
+      sorter.getDocComparator(new DocValuesLeafReader() {
+        @Override
+        public NumericDocValues getNumericDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.NUMERIC) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be NUMERIC but it is [" + dvType + "]");
+          }
+          return DocValues.emptyNumeric();
+        }
 
-          case BINARY:
-            throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
+        @Override
+        public BinaryDocValues getBinaryDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.BINARY) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be BINARY but it is [" + dvType + "]");
+          }
+          return DocValues.emptyBinary();
+        }
 
-          case SORTED:
-            if (sortField.getType().equals(SortField.Type.STRING) == false) {
-              throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
-            }
-            break;
+        @Override
+        public SortedDocValues getSortedDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be SORTED but it is [" + dvType + "]");
+          }
+          return DocValues.emptySorted();
+        }
 
-          case SORTED_NUMERIC:
-            if (sortField instanceof SortedNumericSortField == false) {
-              throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
-            }
-            break;
+        @Override
+        public SortedNumericDocValues getSortedNumericDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED_NUMERIC) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be SORTED_NUMERIC but it is [" + dvType + "]");
+          }
+          return DocValues.emptySortedNumeric(0);
+        }
 
-          case SORTED_SET:
-            if (sortField instanceof SortedSetSortField == false) {
-              throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
-            }
-            break;
+        @Override
+        public SortedSetDocValues getSortedSetDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED_SET) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be SORTED_SET but it is [" + dvType + "]");
+          }
+          return DocValues.emptySortedSet();
+        }
 
-          default:
-            throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
+        @Override
+        public FieldInfos getFieldInfos() {
+          throw new UnsupportedOperationException();
        }
-            break;
-      }
 
+        @Override
+        public int maxDoc() {
+          return 0;

Review comment: `IndexSorter.getDocComparator(Reader)` calls `maxDoc()` on the reader to allocate its comparison arrays. We
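The diff above replaces a type switch with callbacks on a stub reader: the sorter only calls the doc-values getter for its own type, so a mismatched getter can simply throw. A heavily simplified, self-contained sketch of that pattern (hypothetical names, not Lucene's real `IndexSorter`/`DocValuesLeafReader` API) might look like:

```java
import java.util.Objects;

// Hypothetical, simplified illustration of validation-by-stub-reader.
// All names below are invented for the sketch.
enum DocValuesType { NUMERIC, BINARY }

interface StubDocValuesReader {
    void getNumericDocValues(String field);
    void getBinaryDocValues(String field);
}

class SortValidator {
    // sorterWantsNumeric stands in for "which getter the concrete sorter calls"
    static void validate(String fieldToValidate, DocValuesType dvType, boolean sorterWantsNumeric) {
        StubDocValuesReader stub = new StubDocValuesReader() {
            @Override public void getNumericDocValues(String field) {
                if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.NUMERIC) {
                    throw new IllegalArgumentException(
                        "expected [" + field + "] to be NUMERIC but it is [" + dvType + "]");
                }
            }
            @Override public void getBinaryDocValues(String field) {
                if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.BINARY) {
                    throw new IllegalArgumentException(
                        "expected [" + field + "] to be BINARY but it is [" + dvType + "]");
                }
            }
        };
        // each sorter only touches the getter for its own type, so a mismatch throws
        if (sorterWantsNumeric) {
            stub.getNumericDocValues(fieldToValidate);
        } else {
            stub.getBinaryDocValues(fieldToValidate);
        }
    }
}
```

The appeal of the design is that adding a new sort type needs no change to the validator: the new sorter's own getter call does the checking.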
[jira] [Commented] (LUCENE-9321) Port documentation task to gradle
[ https://issues.apache.org/jira/browse/LUCENE-9321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090455#comment-17090455 ] Tomoko Uchida commented on LUCENE-9321: --- I opened LUCENE-9344 and SOLR-14429 with draft patches that convert ".txt" files to ".md". Note: Solr has a lot of "README.txt" files, and some of them are not actually markdown but plain text; I converted all of them to .md for consistency. I would like to merge it before this issue (because the gradle task also refers to the .md files); could you review it, please? > Port documentation task to gradle > - > > Key: LUCENE-9321 > URL: https://issues.apache.org/jira/browse/LUCENE-9321 > Project: Lucene - Core > Issue Type: Sub-task > Components: general/build >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Major > > This is a placeholder issue for porting ant "documentation" task to gradle. > The generated documents should be able to be published on lucene.apache.org > web site on "as-is" basis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-9344) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/LUCENE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090440#comment-17090440 ] Tomoko Uchida edited comment on LUCENE-9344 at 4/23/20, 9:50 AM: - I opened [https://github.com/apache/lucene-solr/pull/1449] - Changed README.txt, MIGRATE.txt, etc. to .md and partially fix its markdown formatting. - LICENCE.txt and NOTICE.txt were not modified. This also modifies build.xml so that distribution package built by {{ant package-tgz}} includes the all .md files. {{ant documentation}} also works fine. TODO: run smoke test was (Author: tomoko uchida): I opened [https://github.com/apache/lucene-solr/pull/1449] - Changed README.txt, MIGRATE.txt, etc. to .md and partially fix its markdown formatting. - LICENCE.txt and NOTICE.txt were not modified. This also modifies build.xml so that distribution package built by {{ant package-tgz}} includes the all .md files TODO: run smoke test > Convert XXX.txt files to proper XXX.md > --- > > Key: LUCENE-9344 > URL: https://issues.apache.org/jira/browse/LUCENE-9344 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Text files that are (partially) written in markdown (such as "README.txt") > can be converted to proper markdown files. This change was suggested on > LUCENE-9321. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14429) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/SOLR-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090448#comment-17090448 ] Tomoko Uchida commented on SOLR-14429: -- I opened [https://github.com/apache/lucene-solr/pull/1450] * Converted all README.txt to README.md and partially fixed their formatting (as proper markdown). I also fixed pointers to the files. * LICENCE.txt and NOTICE.txt were not modified. This also modifies build.xml so that the distribution package built by {{ant create-package}} includes all the .md files TODO: run smoke test > Convert XXX.txt files to proper XXX.md > -- > > Key: SOLR-14429 > URL: https://issues.apache.org/jira/browse/SOLR-14429 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > "README.txt" files are (partially) written in markdown and can be converted > to proper markdown files. This change was suggested on LUCENE-9321.
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #1444: LUCENE-9338: Clean up type safety in SimpleBindings
romseygeek commented on a change in pull request #1444: URL: https://github.com/apache/lucene-solr/pull/1444#discussion_r413670977

## File path: lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionValueSource.java

##
@@ -42,13 +42,17 @@
     this.expression = Objects.requireNonNull(expression);
     variables = new DoubleValuesSource[expression.variables.length];
     boolean needsScores = false;
-    for (int i = 0; i < variables.length; i++) {
-      DoubleValuesSource source = bindings.getDoubleValuesSource(expression.variables[i]);
-      if (source == null) {
-        throw new RuntimeException("Internal error. Variable (" + expression.variables[i] + ") does not exist.");
+    try {
+      for (int i = 0; i < variables.length; i++) {
+        DoubleValuesSource source = bindings.getDoubleValuesSource(expression.variables[i]);
+        if (source == null) {
+          throw new RuntimeException("Internal error. Variable (" + expression.variables[i] + ") does not exist.");
+        }
+        needsScores |= source.needsScores();
+        variables[i] = source;
       }
-      needsScores |= source.needsScores();
-      variables[i] = source;
+    } catch (StackOverflowError e) {

Review comment: I've reworked this so that instead of a `Supplier` we store a `Function`, and supply a special `Bindings` implementation in `validate()` that checks for cycles. Definitely much nicer, thanks for nudging me in the right direction! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
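The cycle check described in the review comment can be sketched generically: resolve each expression's variables with an explicit path and fail fast on a cycle, rather than recursing until a `StackOverflowError`. This is a hypothetical illustration, not SimpleBindings' actual code; all names are invented:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: expressions map to the variables they reference;
// validate() walks the dependency graph and throws on a cycle.
class BindingsSketch {
    private final Map<String, List<String>> deps = new HashMap<>();

    void addExpression(String name, String... variables) {
        deps.put(name, Arrays.asList(variables));
    }

    void validate() {
        for (String name : deps.keySet()) {
            resolve(name, new ArrayDeque<>());
        }
    }

    private void resolve(String name, Deque<String> path) {
        if (path.contains(name)) {
            // fail fast instead of recursing forever
            throw new IllegalArgumentException("Cycle detected: " + path + " -> " + name);
        }
        List<String> vars = deps.get(name);
        if (vars == null) {
            return; // a plain field reference, nothing further to resolve
        }
        path.push(name);
        for (String v : vars) {
            resolve(v, path);
        }
        path.pop();
    }
}
```

Validating eagerly at binding time, as the PR does, surfaces the error with a clear message instead of a crash deep inside evaluation.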
[jira] [Commented] (LUCENE-9344) Convert XXX.txt files to proper XXX.md
[ https://issues.apache.org/jira/browse/LUCENE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090440#comment-17090440 ] Tomoko Uchida commented on LUCENE-9344: --- I opened [https://github.com/apache/lucene-solr/pull/1449] - Changed README.txt, MIGRATE.txt, etc. to .md and partially fix its markdown formatting. - LICENCE.txt and NOTICE.txt were not modified. This also modifies build.xml so that distribution package built by {{ant package-tgz}} includes the all .md files TODO: run smoke test > Convert XXX.txt files to proper XXX.md > --- > > Key: LUCENE-9344 > URL: https://issues.apache.org/jira/browse/LUCENE-9344 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Text files that are (partially) written in markdown (such as "README.txt") > can be converted to proper markdown files. This change was suggested on > LUCENE-9321. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1440: LUCENE-9330: Make SortFields responsible for index sorting and serialization
jpountz commented on a change in pull request #1440: URL: https://github.com/apache/lucene-solr/pull/1440#discussion_r413639432

## File path: lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene70/package.html

##
@@ -0,0 +1,25 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements. See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License. You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+
+
+
+
+
+Lucene 7.0 file format.
+
+

Review comment: you could use a package-info.java instead. We use the html version in backward-codecs because there are otherwise conflicts if the package also exists in core, but it looks like you removed the package from core here so we could use a package-info.java in backward-codecs?

## File path: lucene/core/src/java/org/apache/lucene/index/DefaultIndexingChain.java

##
@@ -527,45 +589,63 @@ private void indexPoint(PerField fp, IndexableField field) throws IOException {
     fp.pointValuesWriter.addPackedValue(docState.docID, field.binaryValue());
   }
 
-  private void validateIndexSortDVType(Sort indexSort, String fieldName, DocValuesType dvType) {
+  private void validateIndexSortDVType(Sort indexSort, String fieldToValidate, DocValuesType dvType) throws IOException {
     for (SortField sortField : indexSort.getSort()) {
-      if (sortField.getField().equals(fieldName)) {
-        switch (dvType) {
-          case NUMERIC:
-            if (sortField.getType().equals(SortField.Type.INT) == false &&
-                sortField.getType().equals(SortField.Type.LONG) == false &&
-                sortField.getType().equals(SortField.Type.FLOAT) == false &&
-                sortField.getType().equals(SortField.Type.DOUBLE) == false) {
-              throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
-            }
-            break;
+      IndexSorter sorter = sortField.getIndexSorter();
+      if (sorter == null) {
+        throw new IllegalStateException("Cannot sort index with sort order " + sortField);
+      }
+      sorter.getDocComparator(new DocValuesLeafReader() {
+        @Override
+        public NumericDocValues getNumericDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.NUMERIC) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be NUMERIC but it is [" + dvType + "]");
+          }
+          return DocValues.emptyNumeric();
+        }
 
-          case BINARY:
-            throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
+        @Override
+        public BinaryDocValues getBinaryDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.BINARY) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be BINARY but it is [" + dvType + "]");
+          }
+          return DocValues.emptyBinary();
+        }
 
-          case SORTED:
-            if (sortField.getType().equals(SortField.Type.STRING) == false) {
-              throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
-            }
-            break;
+        @Override
+        public SortedDocValues getSortedDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED) {
+            throw new IllegalArgumentException("SortField " + sortField + " expected field [" + field + "] to be SORTED but it is [" + dvType + "]");
+          }
+          return DocValues.emptySorted();
+        }
 
-          case SORTED_NUMERIC:
-            if (sortField instanceof SortedNumericSortField == false) {
-              throw new IllegalArgumentException("invalid doc value type:" + dvType + " for sortField:" + sortField);
-            }
-            break;
+        @Override
+        public SortedNumericDocValues getSortedNumericDocValues(String field) {
+          if (Objects.equals(field, fieldToValidate) && dvType != DocValuesType.SORTED_NUMERIC
[GitHub] [lucene-solr] mocobeta opened a new pull request #1450: SOLR-14429: Convert XXX.txt files to proper XXX.md
mocobeta opened a new pull request #1450: URL: https://github.com/apache/lucene-solr/pull/1450 # Description Converted all README.txt to README.md and partially fixed their formatting (as proper markdown). I also fixed pointers to the files. See https://issues.apache.org/jira/browse/SOLR-14429 # Tests - Distribution package built by `ant create-package` includes all the .md files - TODO: run smoke test
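A bulk rename of this kind can be sketched in shell. This is a hypothetical illustration only (the actual PR was prepared against the repo and also fixed cross-references; in a git checkout, `git mv` would preserve history instead of plain `mv`):

```shell
# Rename every README.txt under the given directory to README.md,
# keeping each file in its original location.
rename_readmes() {
  find "$1" -name 'README.txt' -print0 | while IFS= read -r -d '' f; do
    mv "$f" "${f%.txt}.md"
  done
}
```

The `-print0` / `read -d ''` pairing keeps the loop safe for paths containing spaces or newlines.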
[GitHub] [lucene-solr] mocobeta commented on a change in pull request #1449: LUCENE-9344: Convert XXX.txt files to proper XXX.md
mocobeta commented on a change in pull request #1449: URL: https://github.com/apache/lucene-solr/pull/1449#discussion_r413662279 ## File path: lucene/SYSTEM_REQUIREMENTS.md ## @@ -14,5 +14,5 @@ implementing Lucene (document size, number of documents, and number of hits retrieved to name a few). The benchmarks page has some information related to performance on particular platforms. -*To build Apache Lucene from source, refer to the `BUILD.txt` file in +*To build Apache Lucene from the source, refer to the `BUILD.txt` file in Review comment: My linter complained, so I fixed the phrasing as it suggested; this can be reverted.
[GitHub] [lucene-solr] mocobeta commented on a change in pull request #1449: LUCENE-9344: Convert XXX.txt files to proper XXX.md
mocobeta commented on a change in pull request #1449: URL: https://github.com/apache/lucene-solr/pull/1449#discussion_r413661856 ## File path: lucene/JRE_VERSION_MIGRATION.md ## @@ -19,16 +19,16 @@ For reference, JRE major versions with their corresponding Unicode versions: * Java 8, Unicode 6.2 * Java 9, Unicode 8.0 -In general, whether or not you need to re-index largely depends upon the data that +In general, whether you need to re-index largely depends upon the data that Review comment: My linter complained, so fixed the phrasing as it suggested; can be reverted. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mocobeta commented on a change in pull request #1449: LUCENE-9344: Convert XXX.txt files to proper XXX.md
mocobeta commented on a change in pull request #1449: URL: https://github.com/apache/lucene-solr/pull/1449#discussion_r413662067 ## File path: lucene/JRE_VERSION_MIGRATION.md ## @@ -19,16 +19,16 @@ For reference, JRE major versions with their corresponding Unicode versions: * Java 8, Unicode 6.2 * Java 9, Unicode 8.0 -In general, whether or not you need to re-index largely depends upon the data that +In general, whether you need to re-index largely depends upon the data that you are searching, and what was changed in any given Unicode version. For example, -if you are completely sure that your content is limited to the "Basic Latin" range +if you are completely sure your content is limited to the "Basic Latin" range of Unicode, you can safely ignore this. ## Special Notes: LUCENE 2.9 TO 3.0, JAVA 1.4 TO JAVA 5 TRANSITION * `StandardAnalyzer` will return the same results under Java 5 as it did under Java 1.4. This is because it is largely independent of the runtime JRE for -Unicode support, (with the exception of lowercasing). However, no changes to +Unicode support, (except for lowercasing). However, no changes to Review comment: My linter complained, so fixed the phrasing as it suggested; can be reverted. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mocobeta commented on a change in pull request #1449: LUCENE-9344: Convert XXX.txt files to proper XXX.md
mocobeta commented on a change in pull request #1449: URL: https://github.com/apache/lucene-solr/pull/1449#discussion_r413660560 ## File path: lucene/BUILD.md ## @@ -66,7 +66,7 @@ system. NOTE: the ~ character represents your user account home directory. -Step 3) Run ant +## Step 4) Run ant Review comment: Actually it's step 4. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mocobeta opened a new pull request #1449: LUCENE-9344: Convert XXX.txt files to proper XXX.md
mocobeta opened a new pull request #1449: URL: https://github.com/apache/lucene-solr/pull/1449 # Description Changed README.txt, MIGRATE.txt, etc. to `.md` and partially fixed their markdown formatting. LICENCE.txt and NOTICE.txt were not modified. See https://issues.apache.org/jira/browse/LUCENE-9344 # Tests - Distribution package built by `ant package-tgz` includes all the .md files - TODO: run smoke test
[jira] [Updated] (SOLR-12182) Can not switch urlScheme in 7x if there are any cores in the cluster
[ https://issues.apache.org/jira/browse/SOLR-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geza Nagy updated SOLR-12182: - Attachment: SOLR-12182_20200423.patch > Can not switch urlScheme in 7x if there are any cores in the cluster > > > Key: SOLR-12182 > URL: https://issues.apache.org/jira/browse/SOLR-12182 > Project: Solr > Issue Type: Bug >Affects Versions: 7.0, 7.1, 7.2 >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-12182.patch, SOLR-12182_20200423.patch > > > I was trying to enable TLS on a cluster that was already in use i.e. had > existing collections and ended up with down cores, that wouldn't come up and > the following core init errors in the logs: > *org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: > replica with coreNodeName core_node4 exists but with a different name or > base_url.* > What is happening here is that the core/replica is defined in the > clusterstate with the urlScheme as part of it's base URL e.g. > *"base_url":"http:hostname:port/solr"*. > Switching the urlScheme in Solr breaks this convention as the host now uses > HTTPS instead. > Actually, I ran into this with an older version because I was running with > *legacyCloud=false* and then realized that we switched that to the default > behavior only in 7x i.e while most users did not hit this issue with older > versions, unless they overrode the legacyCloud value explicitly, users > running 7x are bound to run into this more often. > Switching the value of legacyCloud to true, bouncing the cluster so that the > clusterstate gets flushed, and then setting it back to false is a workaround > but a bit risky one if you don't know if you have any old cores lying around. > Ideally, I think we shouldn't prepend the urlScheme to the base_url value and > use the urlScheme on the fly to construct it. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12182) Can not switch urlScheme in 7x if there are any cores in the cluster
[ https://issues.apache.org/jira/browse/SOLR-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090432#comment-17090432 ] Geza Nagy commented on SOLR-12182: -- Hi, I'm working on this and have uploaded a patch. I've put my changes into the ZKSyncTool class and attached the sh script for starting it. Originally it was made to synchronize the security json, to ensure its content in ZK. I extended it to look for and correct wrong base URLs, replica by replica. It collects the information from env variables; I guess it should be modified to read system properties or other sources instead. > Can not switch urlScheme in 7x if there are any cores in the cluster > > > Key: SOLR-12182 > URL: https://issues.apache.org/jira/browse/SOLR-12182 > Project: Solr > Issue Type: Bug >Affects Versions: 7.0, 7.1, 7.2 >Reporter: Anshum Gupta >Priority: Major > Attachments: SOLR-12182.patch > > > I was trying to enable TLS on a cluster that was already in use i.e. had > existing collections and ended up with down cores, that wouldn't come up and > the following core init errors in the logs: > *org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: > replica with coreNodeName core_node4 exists but with a different name or > base_url.* > What is happening here is that the core/replica is defined in the > clusterstate with the urlScheme as part of it's base URL e.g. > *"base_url":"http:hostname:port/solr"*. > Switching the urlScheme in Solr breaks this convention as the host now uses > HTTPS instead. > Actually, I ran into this with an older version because I was running with > *legacyCloud=false* and then realized that we switched that to the default > behavior only in 7x i.e while most users did not hit this issue with older > versions, unless they overrode the legacyCloud value explicitly, users > running 7x are bound to run into this more often. 
> Switching the value of legacyCloud to true, bouncing the cluster so that the > clusterstate gets flushed, and then setting it back to false is a workaround > but a bit risky one if you don't know if you have any old cores lying around. > Ideally, I think we shouldn't prepend the urlScheme to the base_url value and > use the urlScheme on the fly to construct it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
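The per-replica base_url correction the patch describes comes down to rewriting the scheme prefix of each stored URL. This is a hypothetical sketch only (the real tool reads replica state from ZooKeeper and updates it there; `BaseUrlFixer` and `withScheme` are invented names):

```java
// Hypothetical sketch: rewrite the scheme of a stored base_url
// (e.g. "http://host:8983/solr" -> "https://host:8983/solr").
class BaseUrlFixer {
    static String withScheme(String baseUrl, String targetScheme) {
        int i = baseUrl.indexOf("://");
        if (i < 0) {
            return baseUrl; // leave entries without a scheme untouched
        }
        return targetScheme + baseUrl.substring(i);
    }
}
```

Rewriting only the prefix keeps host, port, and context path intact, which matches the ticket's observation that everything but the scheme is still valid after enabling TLS.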
[jira] [Updated] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-14428: Description: I sent this to the mailing list I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors while running our normal tests. After profiling it was clear that the majority of the heap was allocated through FuzzyQuery. LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the FuzzyQuery's constructor. I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries from random UUID strings for 5 minutes {code} FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" {code} When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while the memory usage has increased drastically on 8.5.0 and 8.5.1. Comparison of heap usage while running the attached test against Solr 8.3.1 and 8.5.1 with a single (empty) shard and 4GB heap: !image-2020-04-23-09-18-06-070.png! And with 4 shards on 8.4.1 and 8.5.0: !screenshot-2.png! I'm guessing that the memory might be being leaked if the FuzzyQuery objects are referenced from the cache, while the FuzzyTermsEnum would not have been. Query Result Cache on 8.5.1: !screenshot-3.png! ~316mb in the cache QRC on 8.3.1 !screenshot-4.png! <1mb was: I sent this to the mailing list I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors while running our normal tests. After profiling it was clear that the majority of the heap was allocated through FuzzyQuery. LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the FuzzyQuery's constructor. 
I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries from random UUID strings for 5 minutes {code} FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" {code} When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while the memory usage has increased drastically on 8.5.0 and 8.5.1. Comparison of heap usage while running the attached test against Solr 8.3.1 and 8.5.1 with a single (empty) shard and 4GB heap: !image-2020-04-23-09-18-06-070.png! And with 4 shards on 8.4.1 and 8.5.0: !screenshot-2.png! I'm guessing that the memory might be being leaked if the FuzzyQuery objects are referenced from the cache, while the FuzzyTermsEnum would not have been. Query Result Cache on 8.5.1: !screenshot-3.png! ~316mb in the cache > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. 
> Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache > QRC on 8.3.1 > !screenshot-4.png! > <1mb -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
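The suspected leak mechanism above, eager construction plus query caching, can be illustrated abstractly. This is a hypothetical sketch, not Lucene's FuzzyQuery code: when the expensive structure is built in the constructor, every cached query instance retains it; when it is built lazily, a cached query stays cheap until it is actually executed:

```java
import java.util.function.Supplier;

// Hypothetical illustration of eager vs. lazy construction under caching.
class EagerFuzzy {
    static int allocations = 0;
    final int[] automaton;  // stands in for the compiled Levenshtein automata
    EagerFuzzy() {
        allocations++;
        automaton = new int[1 << 16]; // built up front, retained for the query's lifetime
    }
}

class LazyFuzzy {
    static int allocations = 0;
    // nothing large exists until the query is actually rewritten/executed
    final Supplier<int[]> automaton = () -> {
        allocations++;
        return new int[1 << 16];
    };
}
```

If a result cache holds thousands of `EagerFuzzy`-style instances, their heap cost is the automata times the cache size, consistent with the ~316mb cache observed on 8.5.1 versus <1mb on 8.3.1.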
[jira] [Created] (SOLR-14429) Convert XXX.txt files to proper XXX.md
Tomoko Uchida created SOLR-14429: Summary: Convert XXX.txt files to proper XXX.md Key: SOLR-14429 URL: https://issues.apache.org/jira/browse/SOLR-14429 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Tomoko Uchida Assignee: Tomoko Uchida "README.txt" files are (partially) written in markdown and can be converted to proper markdown files. This change was suggested on LUCENE-9321. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9344) Convert XXX.txt files to proper XXX.md
Tomoko Uchida created LUCENE-9344: - Summary: Convert XXX.txt files to proper XXX.md Key: LUCENE-9344 URL: https://issues.apache.org/jira/browse/LUCENE-9344 Project: Lucene - Core Issue Type: Improvement Affects Versions: master (9.0) Reporter: Tomoko Uchida Assignee: Tomoko Uchida Text files that are (partially) written in markdown (such as "README.txt") can be converted to proper markdown files. This change was suggested on LUCENE-9321. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-14428: Attachment: screenshot-4.png > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png, screenshot-4.png > > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. > Query Result Cache on 8.5.1: > !screenshot-3.png! > ~316mb in the cache -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-14428: Attachment: screenshot-3.png > FuzzyQuery has severe memory usage in 8.5 > - > > Key: SOLR-14428 > URL: https://issues.apache.org/jira/browse/SOLR-14428 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.5, 8.5.1 >Reporter: Colvin Cowie >Priority: Major > Attachments: FuzzyHammer.java, image-2020-04-23-09-18-06-070.png, > screenshot-2.png, screenshot-3.png > > > I sent this to the mailing list > I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors > while running our normal tests. After profiling it was clear that the > majority of the heap was allocated through FuzzyQuery. > LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the > FuzzyQuery's constructor. > I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries > from random UUID strings for 5 minutes > {code} > FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2" > {code} > When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while > the memory usage has increased drastically on 8.5.0 and 8.5.1. > Comparison of heap usage while running the attached test against Solr 8.3.1 > and 8.5.1 with a single (empty) shard and 4GB heap: > !image-2020-04-23-09-18-06-070.png! > And with 4 shards on 8.4.1 and 8.5.0: > !screenshot-2.png! > I'm guessing that the memory might be being leaked if the FuzzyQuery objects > are referenced from the cache, while the FuzzyTermsEnum would not have been. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colvin Cowie updated SOLR-14428:
    Description:
I sent this to the mailing list.
I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors while running our normal tests. After profiling, it was clear that the majority of the heap was allocated through FuzzyQuery.
LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the FuzzyQuery's constructor.
I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries built from random UUID strings for 5 minutes:
{code}
FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2"
{code}
When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while memory usage has increased drastically on 8.5.0 and 8.5.1.
Comparison of heap usage while running the attached test against Solr 8.3.1 and 8.5.1 with a single (empty) shard and a 4GB heap:
!image-2020-04-23-09-18-06-070.png!
And with 4 shards on 8.4.1 and 8.5.0:
!screenshot-2.png!
I'm guessing that the memory might be leaked if the FuzzyQuery objects are referenced from the cache, while the FuzzyTermsEnum would not have been.
Query Result Cache on 8.5.1:
!screenshot-3.png!
~316 MB in the cache

    was:
I sent this to the mailing list.
I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors while running our normal tests. After profiling, it was clear that the majority of the heap was allocated through FuzzyQuery.
LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the FuzzyQuery's constructor.
I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries built from random UUID strings for 5 minutes:
{code}
FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2"
{code}
When running against a vanilla Solr 8.3.1 and 8.4.1 there is no problem, while memory usage has increased drastically on 8.5.0 and 8.5.1.
Comparison of heap usage while running the attached test against Solr 8.3.1 and 8.5.1 with a single (empty) shard and a 4GB heap:
!image-2020-04-23-09-18-06-070.png!
And with 4 shards on 8.4.1 and 8.5.0:
!screenshot-2.png!
I'm guessing that the memory might be leaked if the FuzzyQuery objects are referenced from the cache, while the FuzzyTermsEnum would not have been.
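The query pattern quoted in the issue can be sketched as a standalone generator. The class and method names below (`FuzzyHammerSketch`, `randomFuzzyQuery`) are illustrative, not taken from the attached FuzzyHammer.java:

```java
import java.util.UUID;

public class FuzzyHammerSketch {
    // Builds one query string of the shape described in the issue: a random
    // UUID with its hyphens stripped, queried fuzzily with edit distance 2.
    static String randomFuzzyQuery(String fieldName) {
        return fieldName + ":" + UUID.randomUUID().toString().replace("-", "") + "~2";
    }

    public static void main(String[] args) {
        // Each call yields a term that almost certainly misses the index, so
        // every query forces a fresh Levenshtein automaton to be built.
        System.out.println(randomFuzzyQuery("id"));
    }
}
```

Because each generated term is unique, every query produces a distinct FuzzyQuery instance, which (per the report) is where the automata now live after LUCENE-9068.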
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1444: LUCENE-9338: Clean up type safety in SimpleBindings
jpountz commented on a change in pull request #1444: URL: https://github.com/apache/lucene-solr/pull/1444#discussion_r412947903

## File path: lucene/expressions/src/java/org/apache/lucene/expressions/ExpressionValueSource.java ##

@@ -42,13 +42,17 @@
     this.expression = Objects.requireNonNull(expression);
     variables = new DoubleValuesSource[expression.variables.length];
     boolean needsScores = false;
-    for (int i = 0; i < variables.length; i++) {
-      DoubleValuesSource source = bindings.getDoubleValuesSource(expression.variables[i]);
-      if (source == null) {
-        throw new RuntimeException("Internal error. Variable (" + expression.variables[i] + ") does not exist.");
+    try {
+      for (int i = 0; i < variables.length; i++) {
+        DoubleValuesSource source = bindings.getDoubleValuesSource(expression.variables[i]);
+        if (source == null) {
+          throw new RuntimeException("Internal error. Variable (" + expression.variables[i] + ") does not exist.");
+        }
+        needsScores |= source.needsScores();
+        variables[i] = source;
       }
-      needsScores |= source.needsScores();
-      variables[i] = source;
+    } catch (StackOverflowError e) {

Review comment: Hmm, this is a pre-existing issue, but catching stack overflows is usually a bad idea as it might leave objects in an inconsistent state. I wonder if it could be checked differently. Also, is it OK to move the catch from validate() to here?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
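One way the cycle could be "checked differently", as the review comment suggests, is an explicit depth counter instead of catching StackOverflowError. The sketch below uses hypothetical names (`DepthGuardSketch`, `resolve`, `MAX_DEPTH`) and a plain name-to-name alias map, not the actual SimpleBindings code; the point is only that a counter turns a would-be stack overflow on cyclic input into an ordinary, catchable exception:

```java
import java.util.HashMap;
import java.util.Map;

public class DepthGuardSketch {
    // Assumed limit for illustration; a real patch might choose differently.
    static final int MAX_DEPTH = 128;

    // Hypothetical alias resolver: follows name -> name bindings until a
    // terminal name is found. The explicit counter fails deterministically on
    // cyclic input, leaving no half-initialized state behind.
    static String resolve(Map<String, String> aliases, String name, int depth) {
        if (depth > MAX_DEPTH) {
            throw new IllegalArgumentException("Recursion too deep while resolving: " + name);
        }
        String next = aliases.get(name);
        return next == null ? name : resolve(aliases, next, depth + 1);
    }

    public static void main(String[] args) {
        Map<String, String> aliases = new HashMap<>();
        aliases.put("popularity", "boost"); // popularity -> boost (terminal)
        System.out.println(resolve(aliases, "popularity", 0));
    }
}
```

Unlike StackOverflowError, the IllegalArgumentException here can be caught safely at any level, since no constructor is left partially executed by an unpredictable unwinding point.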
[jira] [Commented] (SOLR-12690) Regularize LoggerFactory declarations
[ https://issues.apache.org/jira/browse/SOLR-12690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090409#comment-17090409 ] Andrzej Bialecki commented on SOLR-12690: - Good catch David! Indeed, it should be {{TLOG}} . > Regularize LoggerFactory declarations > - > > Key: SOLR-12690 > URL: https://issues.apache.org/jira/browse/SOLR-12690 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Minor > Fix For: 7.5, 8.0 > > Attachments: SOLR-12690.patch, SOLR-12690.patch > > > LoggerFactory declarations have several different forms, they should all be: > private static final Logger log = > LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); > * lowercase log > * private static > * non hard-coded class lookup. > I'm going to regularize all of these, I think there are about 80 currently, > we've been nibbling away at this but I'll try to do it in one go. > [~cpoerschke] I think there's another Jira about this that I can't find just > now, ring any bells? > Once that's done, is there a good way to make violations of this fail > precommit? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
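The declaration recommended in the issue works because MethodHandles.lookup() captures the class it is written in, so the identical line can be pasted into any class without editing a class name. A minimal stdlib-only demonstration of that mechanism (LookupDemo is an illustrative name; slf4j itself is omitted here):

```java
import java.lang.invoke.MethodHandles;

public class LookupDemo {
    // Resolves to LookupDemo.class at this call site. The same line pasted
    // into another class would resolve to that class instead, which is why
    // the pattern avoids hard-coded class names in logger declarations.
    static final Class<?> DECLARING = MethodHandles.lookup().lookupClass();

    public static void main(String[] args) {
        System.out.println(DECLARING.getSimpleName());
    }
}
```

In the real pattern, DECLARING is simply passed to LoggerFactory.getLogger(...), giving each class a correctly named logger from one copy-pasteable line.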
[jira] [Updated] (SOLR-14428) FuzzyQuery has severe memory usage in 8.5
[ https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colvin Cowie updated SOLR-14428:
    Attachment: (was: screenshot-1.png)