[GitHub] [lucene-solr] interma opened a new pull request #2289: Title: add timestamp in default gc_log name

2021-02-01 Thread GitBox


interma opened a new pull request #2289:
URL: https://github.com/apache/lucene-solr/pull/2289


   
   
   
   # Description
   https://issues.apache.org/jira/browse/SOLR-15104
   
   When Solr is restarted, it overwrites the GC log. This behavior is not 
friendly for debugging OOM issues.
   
   # Solution
   
   Add a timestamp to the default gc_log name so it does not overwrite the previous 
one.
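   
   A minimal sketch of the naming idea (illustrative only -- the real change lives in Solr's start scripts, and the exact file-name pattern here is an assumption):
   
   ```
   import java.time.LocalDateTime;
   import java.time.format.DateTimeFormatter;
   
   public class GcLogNameExample {
     public static void main(String[] args) {
       // Append a start timestamp so a restart writes a new file instead of
       // overwriting the previous GC log (the pattern is hypothetical).
       String timestamp = LocalDateTime.now().format(DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss"));
       System.out.println("solr_gc_" + timestamp + ".log");
     }
   }
   ```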
   
   # Tests
   
   Please describe the tests you've developed or run to confirm this patch 
implements the feature or solves the problem.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `master` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] sbeniwal12 commented on a change in pull request #2282: LUCENE-9615: Expose HnswGraphBuilder index-time hyperparameters as FieldType attributes

2021-02-01 Thread GitBox


sbeniwal12 commented on a change in pull request #2282:
URL: https://github.com/apache/lucene-solr/pull/2282#discussion_r568327126



##
File path: 
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90VectorWriter.java
##
@@ -188,9 +190,29 @@ private void writeGraph(
   RandomAccessVectorValuesProducer vectorValues,
   long graphDataOffset,
   long[] offsets,
-  int count)
+  int count,
+  String maxConnStr,
+  String beamWidthStr)
   throws IOException {
-HnswGraphBuilder hnswGraphBuilder = new HnswGraphBuilder(vectorValues);
+int maxConn, beamWidth;
+if (maxConnStr == null) {
+  maxConn = HnswGraphBuilder.DEFAULT_MAX_CONN;
+} else if (!maxConnStr.matches("[0-9]+")) {

Review comment:
   Thanks for pointing this out. I now throw a `NumberFormatException` with a message 
describing which attribute caused the exception.
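   
   For context, a hedged sketch of the parse-or-default pattern described here (the helper name and message text are illustrative, not the PR's exact code):
   
   ```
   // Fall back to the default when the attribute is absent; otherwise fail with
   // a message naming the offending attribute.
   private static int parseVectorAttribute(String value, String attributeName, int defaultValue) {
     if (value == null) {
       return defaultValue;
     }
     try {
       return Integer.parseInt(value);
     } catch (NumberFormatException e) {
       throw new NumberFormatException(
           "Invalid value \"" + value + "\" for attribute " + attributeName);
     }
   }
   ```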

##
File path: lucene/core/src/java/org/apache/lucene/document/VectorField.java
##
@@ -53,6 +54,44 @@ private static FieldType getType(float[] v, 
VectorValues.SearchStrategy searchSt
 return type;
   }
 
+  /**
+   * Public method to create HNSW field type with the given max-connections 
and beam-width
+   * parameters that would be used by HnswGraphBuilder while constructing HNSW 
graph.
+   *
+   * @param dimension dimension of vectors
+   * @param searchStrategy a function defining vector proximity.
+   * @param maxConn max-connections at each HNSW graph node
+   * @param beamWidth size of list to be used while constructing HNSW graph
+   * @throws IllegalArgumentException if any parameter is null, or has dimension > 1024.
+   */
+  public static FieldType createHnswType(
+  int dimension, VectorValues.SearchStrategy searchStrategy, int maxConn, 
int beamWidth) {
+if (dimension == 0) {
+  throw new IllegalArgumentException("cannot index an empty vector");
+}
+if (dimension > VectorValues.MAX_DIMENSIONS) {
+  throw new IllegalArgumentException(
+  "cannot index vectors with dimension greater than " + 
VectorValues.MAX_DIMENSIONS);
+}
+if (searchStrategy == null) {
+  throw new IllegalArgumentException("search strategy must not be null");

Review comment:
   Added this check and also added a unit test for this check. 
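   
   For illustration, a hedged sketch of what such a unit test could look like (using `expectThrows` from Lucene's test framework; the test name and argument values are made up):
   
   ```
   // createHnswType should reject a null search strategy.
   public void testCreateHnswTypeRejectsNullSearchStrategy() {
     IllegalArgumentException e =
         expectThrows(
             IllegalArgumentException.class,
             () -> VectorField.createHnswType(4, null, 16, 100));
     assertTrue(e.getMessage().contains("search strategy"));
   }
   ```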

##
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/KnnGraphTester.java
##
@@ -132,13 +135,13 @@ private void run(String... args) throws Exception {
   if (iarg == args.length - 1) {
 throw new IllegalArgumentException("-beamWidthIndex requires a 
following number");
   }
-  HnswGraphBuilder.DEFAULT_BEAM_WIDTH = Integer.parseInt(args[++iarg]);

Review comment:
   Made them final and also made changes to `TestKnnGraph.java` to 
accommodate this change. 
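   
   For reference, a hedged sketch of that alternative, i.e. parsing the flag into a local variable instead of mutating the (now final) static default (argument handling is simplified and illustrative):
   
   ```
   int beamWidth = HnswGraphBuilder.DEFAULT_BEAM_WIDTH;
   if ("-beamWidthIndex".equals(args[iarg])) {
     if (iarg == args.length - 1) {
       throw new IllegalArgumentException("-beamWidthIndex requires a following number");
     }
     // keep the override local and pass it to the builder later
     beamWidth = Integer.parseInt(args[++iarg]);
   }
   ```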





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] zacharymorn commented on pull request #2258: LUCENE-9686: Fix read past EOF handling in DirectIODirectory

2021-02-01 Thread GitBox


zacharymorn commented on pull request #2258:
URL: https://github.com/apache/lucene-solr/pull/2258#issuecomment-771322610


   > Hi Zach. Sorry for the belated reply. Please take a look at my comments 
attached to the context. I have some doubts about whether EOF should leave the 
channel undrained. Maybe I'm paranoid here, though.
   
   Hi Dawid, no worries, and thanks for the review! I've replied to the comment 
and added some tests to verify; please let me know if they look good to you.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2258: LUCENE-9686: Fix read past EOF handling in DirectIODirectory

2021-02-01 Thread GitBox


zacharymorn commented on a change in pull request #2258:
URL: https://github.com/apache/lucene-solr/pull/2258#discussion_r568294041



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -381,17 +377,18 @@ public long length() {
 @Override
 public byte readByte() throws IOException {
   if (!buffer.hasRemaining()) {
-refill();
+refill(1);
   }
+
   return buffer.get();
 }
 
-private void refill() throws IOException {
+private void refill(int byteToRead) throws IOException {
   filePos += buffer.capacity();
 
   // BaseDirectoryTestCase#testSeekPastEOF test for consecutive read past EOF,
   // hence throwing EOFException early to maintain buffer state (position in particular)
-  if (filePos > channel.size()) {
+  if (filePos > channel.size() || (channel.size() - filePos < byteToRead)) {

Review comment:
   If I understand your comment correctly, your concern is about the 
consistency of the directory's internal state after EOF is raised, right? I think 
DirectIODirectory already handles that (by manipulating `filePos`, though 
not `channel.position` per se), and I have added some more tests in the latest 
commit to confirm that this is the case. 
   
   Please note that I originally added the additional tests to 
`BaseDirectoryTestCase#testSeekPastEOF`, but that would fail some existing 
tests for other directory implementations, since a read immediately after a seek 
past EOF doesn't raise an EOFException for them:
   
   * TestHardLinkCopyDirectoryWrapper
   * TestMmapDirectory
   * TestByteBuffersDirectory
   * TestMultiMMap
   
   However, according to the javadoc here 
https://github.com/apache/lucene-solr/blob/15aaec60d9bfa96f2837c38b7ca83e2c87c66d8d/lucene/core/src/java/org/apache/lucene/store/IndexInput.java#L66-L73,
 this seems to be unspecified behavior in general. 
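   
   For illustration, a hedged sketch of the kind of check added (the `dir` setup, file name, and offsets are assumptions; `expectThrows` comes from Lucene's test framework):
   
   ```
   try (IndexInput in = dir.openInput("test.bin", IOContext.DEFAULT)) {
     in.seek(in.length() + 10);                      // seeking past EOF itself is allowed
     expectThrows(EOFException.class, () -> in.readByte());
     in.seek(0);                                     // the input should remain usable
     in.readByte();
   }
   ```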





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2258: LUCENE-9686: Fix read past EOF handling in DirectIODirectory

2021-02-01 Thread GitBox


zacharymorn commented on a change in pull request #2258:
URL: https://github.com/apache/lucene-solr/pull/2258#discussion_r568293170



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -381,17 +377,18 @@ public long length() {
 @Override
 public byte readByte() throws IOException {
   if (!buffer.hasRemaining()) {
-refill();
+refill(1);
   }
+
   return buffer.get();
 }
 
-private void refill() throws IOException {
+private void refill(int byteToRead) throws IOException {

Review comment:
   Updated.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9718) REGEX Pattern Search, character classes with quantifiers do not work

2021-02-01 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276798#comment-17276798
 ] 

Michael Sokolov commented on LUCENE-9718:
-

Thanks Brian, contributions in those areas would be welcome! 

> REGEX Pattern Search, character classes with quantifiers do not work
> 
>
> Key: LUCENE-9718
> URL: https://issues.apache.org/jira/browse/LUCENE-9718
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.7.3, 8.6.3
>Reporter: Brian Feldman
>Priority: Minor
>  Labels: Documentation, RegEx
>
> Character classes with a quantifier do not work: no error is given and no 
> results are returned. For example, \d\{2} or \d\{2,3}, as commonly written 
> in most languages supporting regular expressions, simply and quietly does not 
> work.  A workaround for users is to write them out fully, such as \d\d or 
> [0-9][0-9], or as [0-9]\{2,3}.
>  
> This inconsistency or limitation is not documented, wasting users' time 
> as they have to figure it out themselves. I believe this inconsistency 
> should be clearly documented, and an effort to fix it would 
> improve pattern searching.
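
For illustration, a hedged sketch of the workaround form in a query (the field name and searcher setup are assumptions):

{noformat}
// The explicit character class plus interval, which the report says works,
// instead of the unsupported \d{2,3} shorthand.
Query q = new RegexpQuery(new Term("body", "[0-9]{2,3}"));
TopDocs hits = searcher.search(q, 10);
{noformat}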



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9718) REGEX Pattern Search, character classes with quantifiers do not work

2021-02-01 Thread Michael Sokolov (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Sokolov updated LUCENE-9718:

Issue Type: Improvement  (was: Bug)

> REGEX Pattern Search, character classes with quantifiers do not work
> 
>
> Key: LUCENE-9718
> URL: https://issues.apache.org/jira/browse/LUCENE-9718
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 7.7.3, 8.6.3
>Reporter: Brian Feldman
>Priority: Minor
>  Labels: Documentation, RegEx
>
> Character classes with a quantifier do not work: no error is given and no 
> results are returned. For example, \d\{2} or \d\{2,3}, as commonly written 
> in most languages supporting regular expressions, simply and quietly does not 
> work.  A workaround for users is to write them out fully, such as \d\d or 
> [0-9][0-9], or as [0-9]\{2,3}.
>  
> This inconsistency or limitation is not documented, wasting users' time 
> as they have to figure it out themselves. I believe this inconsistency 
> should be clearly documented, and an effort to fix it would 
> improve pattern searching.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jaisonbi edited a comment on pull request #2213: LUCENE-9663: Adding compression to terms dict from SortedSet/Sorted DocValues

2021-02-01 Thread GitBox


jaisonbi edited a comment on pull request #2213:
URL: https://github.com/apache/lucene-solr/pull/2213#issuecomment-771296030


   If I understand correctly, the route via PerFieldDocValuesFormat needs to 
change how SortedSetDocValues is used.
   The idea is to add another constructor for enabling terms dict compression, 
as below:
   ```
   public SortedSetDocValuesField(String name, BytesRef bytes, boolean compression) {
     super(name, compression ? COMPRESSION_TYPE : TYPE);
     fieldsData = bytes;
   }
   ```
   And below is the definition of COMPRESSION_TYPE:
   
   ```
   public static final FieldType COMPRESSION_TYPE = new FieldType();

   static {
     COMPRESSION_TYPE.setDocValuesType(DocValuesType.SORTED_SET);
     // add one new attribute telling PerFieldDocValuesFormat that terms dict compression is enabled for this field
     COMPRESSION_TYPE.putAttribute("docvalue.sortedset.compression", "true");
     COMPRESSION_TYPE.freeze();
   }
   ```
   Not sure if I've got it right :)
   
   @msokolov @bruno-roustant 
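   
   For illustration, a hedged usage sketch of the proposed constructor at index time (the field name, values, and `writer` are made up; the constructor itself is only a proposal here):
   
   ```
   Document doc = new Document();
   // true => request terms dict compression for this field via the proposed attribute
   doc.add(new SortedSetDocValuesField("tags", new BytesRef("lucene"), true));
   doc.add(new SortedSetDocValuesField("tags", new BytesRef("solr"), true));
   writer.addDocument(doc);
   ```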
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jaisonbi commented on pull request #2213: LUCENE-9663: Adding compression to terms dict from SortedSet/Sorted DocValues

2021-02-01 Thread GitBox


jaisonbi commented on pull request #2213:
URL: https://github.com/apache/lucene-solr/pull/2213#issuecomment-771296030


   If I understand correctly, the route via PerFieldDocValuesFormat needs to 
change how SortedSetDocValues is used.
   The idea is to add another constructor for enabling terms dict compression, 
as below:
   ```
   public SortedSetDocValuesField(String name, BytesRef bytes, boolean compression) {
     super(name, compression ? COMPRESSION_TYPE : TYPE);
     fieldsData = bytes;
   }
   ```
   In COMPRESSION_TYPE, add one new attribute telling 
PerFieldDocValuesFormat that terms dict compression is enabled for this field.
   Not sure if I've got it right :)
   @msokolov @bruno-roustant 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15124) Remove node/container level admin handlers from ImplicitPlugins.json (core level).

2021-02-01 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276789#comment-17276789
 ] 

David Smiley commented on SOLR-15124:
-

{quote}Should we consider setting up redirect paths for the old handlers? Or a 
better error message with a hint that they have moved?
{quote}
No; this is 9.0, and the fact that these are registered here is an obscure 
oddity.  Let's not make removal of tech debt too hard, please, or we will 
increasingly not bother because it's too much of a PITA, and then we're left 
with an even worse tech-debt problem in the years to come (from my experience 
here, looking back 10+ years).
{quote}Also, will need to update SolrCoreTest.testImplicitPlugins, which I 
don't think the existing PR did.
{quote}
[~nazerke] did you run tests?

> Remove node/container level admin handlers from ImplicitPlugins.json (core 
> level).
> --
>
> Key: SOLR-15124
> URL: https://issues.apache.org/jira/browse/SOLR-15124
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: David Smiley
>Priority: Blocker
>  Labels: newdev
> Fix For: master (9.0)
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> There are many very old administrative RequestHandlers registered in a 
> SolrCore that are actually JVM / node / CoreContainer level in nature.  These 
> pre-dated CoreContainer level handlers.  We should (1) remove them from 
> ImplictPlugins.json, and (2) make simplifying tweaks to them to remove that 
> they work at the core level.  For example LoggingHandler has two constructors 
> and a non-final Watcher because it works in these two modalities.  It need 
> only have the one that takes a CoreContainer, and Watcher will then be final.
> /admin/threads
> /admin/properties
> /admin/logging
> Should stay because has core-level stuff:
> /admin/plugins
> /admin/mbeans
> This one:
> /admin/system -- SystemInfoHandler
> returns "core" level information, and also node level stuff.  I propose 
> splitting this one to a CoreInfoHandler to split the logic.  Maybe a separate 
> issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9722) Aborted merge can leak readers if the output is empty

2021-02-01 Thread Nhat Nguyen (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nhat Nguyen updated LUCENE-9722:

Description: 
We fail to close the merged readers of an aborted merge if its output segment 
contains no document.

This bug was discovered by a test in Elasticsearch 
([elastic/elasticsearch#67884|https://github.com/elastic/elasticsearch/issues/67884]).

  was:We fail to close merged readers if the output segment contains no 
document.


> Aborted merge can leak readers if the output is empty
> -
>
> Key: LUCENE-9722
> URL: https://issues.apache.org/jira/browse/LUCENE-9722
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: master (9.0), 8.7
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We fail to close the merged readers of an aborted merge if its output segment 
> contains no document.
> This bug was discovered by a test in Elasticsearch 
> ([elastic/elasticsearch#67884|https://github.com/elastic/elasticsearch/issues/67884]).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9722) Aborted merge can leak readers if the output is empty

2021-02-01 Thread Nhat Nguyen (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nhat Nguyen updated LUCENE-9722:

Status: Patch Available  (was: Open)

> Aborted merge can leak readers if the output is empty
> -
>
> Key: LUCENE-9722
> URL: https://issues.apache.org/jira/browse/LUCENE-9722
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: master (9.0), 8.7
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We fail to close merged readers if the output segment contains no document.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dnhatn opened a new pull request #2288: LUCENE-9722: Close merged readers on abort

2021-02-01 Thread GitBox


dnhatn opened a new pull request #2288:
URL: https://github.com/apache/lucene-solr/pull/2288


   We fail to close merged readers if the output segment contains no document.
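   
   Not the actual IndexWriter change, but a hedged sketch of the general close-on-failure pattern involved (names are illustrative):
   
   ```
   import java.io.IOException;
   import java.util.List;
   import org.apache.lucene.index.SegmentReader;
   import org.apache.lucene.util.IOUtils;
   
   class CloseOnAbortSketch {
     void runMerge(List<SegmentReader> mergeReaders) throws IOException {
       boolean success = false;
       try {
         // ... perform the merge; it may abort or produce an empty output segment ...
         success = true;
       } finally {
         if (!success) {
           // release the pooled merge readers even on abort / empty output
           IOUtils.closeWhileHandlingException(mergeReaders);
         }
       }
     }
   }
   ```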



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9722) Aborted merge can leak readers if the output is empty

2021-02-01 Thread Nhat Nguyen (Jira)
Nhat Nguyen created LUCENE-9722:
---

 Summary: Aborted merge can leak readers if the output is empty
 Key: LUCENE-9722
 URL: https://issues.apache.org/jira/browse/LUCENE-9722
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 8.7, master (9.0)
Reporter: Nhat Nguyen
Assignee: Nhat Nguyen


We fail to close merged readers if the output segment contains no document.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] zhaih commented on a change in pull request #2213: LUCENE-9663: Adding compression to terms dict from SortedSet/Sorted DocValues

2021-02-01 Thread GitBox


zhaih commented on a change in pull request #2213:
URL: https://github.com/apache/lucene-solr/pull/2213#discussion_r568246420



##
File path: 
lucene/core/src/java/org/apache/lucene/codecs/lucene80/Lucene80DocValuesConsumer.java
##
@@ -736,49 +736,92 @@ private void doAddSortedField(FieldInfo field, 
DocValuesProducer valuesProducer)
   private void addTermsDict(SortedSetDocValues values) throws IOException {
 final long size = values.getValueCount();
 meta.writeVLong(size);
-meta.writeInt(Lucene80DocValuesFormat.TERMS_DICT_BLOCK_SHIFT);
+boolean compress =
+Lucene80DocValuesFormat.Mode.BEST_COMPRESSION == mode

Review comment:
   Sorry for the late response. I agree we could solve it in a follow-up issue, 
and I could still test this via a customized PerFieldDocValuesFormat. Thank you!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] zhaih commented on pull request #2213: LUCENE-9663: Adding compression to terms dict from SortedSet/Sorted DocValues

2021-02-01 Thread GitBox


zhaih commented on pull request #2213:
URL: https://github.com/apache/lucene-solr/pull/2213#issuecomment-771265310


   I see. So for now I could test it via a customized 
PerFieldDocValuesFormat; I'll give the PerFieldDocValuesFormat route a try then.
   
   Though IMO I would prefer a simpler configuration (as proposed by @jaisonbi) 
rather than customizing through PerFieldDocValuesFormat in the future, if these 
two compression options show different performance characteristics. If my 
understanding is correct, to enable only terms dict compression using 
PerFieldDocValuesFormat we would need to enumerate all SSDV field names in that 
class, which doesn't sound very maintainable if fields are regularly added or 
deleted. Please correct me if I'm wrong, as I'm not very familiar with the 
codec part...
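   
   For reference, a hedged sketch of the "customized PerFieldDocValuesFormat" testing route mentioned above, done by overriding the per-field hook on the codec (assumes this branch's Lucene87 codec and the Lucene80 doc-values format's BEST_COMPRESSION mode; the field name is made up):
   
   ```
   IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
   iwc.setCodec(
       new Lucene87Codec() {
         private final DocValuesFormat compressed =
             new Lucene80DocValuesFormat(Lucene80DocValuesFormat.Mode.BEST_COMPRESSION);
   
         @Override
         public DocValuesFormat getDocValuesFormatForField(String field) {
           // route only the fields that should compress their terms dict
           return "tags".equals(field) ? compressed : super.getDocValuesFormatForField(field);
         }
       });
   ```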



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14330) Return docs with null value in expand for field when collapse has nullPolicy=collapse

2021-02-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276749#comment-17276749
 ] 

ASF subversion and git services commented on SOLR-14330:


Commit 4a21f594c203fb219942dcbaebbd872dcb2cfd4d in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4a21f59 ]

SOLR-14330: ExpandComponent now supports an expand.nullGroup=true option

(cherry picked from commit 15aaec60d9bfa96f2837c38b7ca83e2c87c66d8d)


> Return docs with null value in expand for field when collapse has 
> nullPolicy=collapse
> -
>
> Key: SOLR-14330
> URL: https://issues.apache.org/jira/browse/SOLR-14330
> Project: Solr
>  Issue Type: Wish
>Reporter: Munendra S N
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14330.patch, SOLR-14330.patch
>
>
> When documents don't contain a value for the field, then with collapse those 
> documents can either be ignored (default), collapsed (one document is 
> chosen), or expanded (all are returned). This is controlled by {{nullPolicy}}.
> When {{nullPolicy}} is {{collapse}}, it would be nice to return all documents 
> with a {{null}} value in the expand block if {{expand=true}}.
> Also, when used with {{expand.field}}, even then we should return such 
> documents.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14330) Return docs with null value in expand for field when collapse has nullPolicy=collapse

2021-02-01 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter resolved SOLR-14330.
---
Fix Version/s: 8.9
   master (9.0)
   Resolution: Fixed

> Return docs with null value in expand for field when collapse has 
> nullPolicy=collapse
> -
>
> Key: SOLR-14330
> URL: https://issues.apache.org/jira/browse/SOLR-14330
> Project: Solr
>  Issue Type: Wish
>Reporter: Munendra S N
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0), 8.9
>
> Attachments: SOLR-14330.patch, SOLR-14330.patch
>
>
> When documents don't contain a value for the field, then with collapse those 
> documents can either be ignored (default), collapsed (one document is 
> chosen), or expanded (all are returned). This is controlled by {{nullPolicy}}.
> When {{nullPolicy}} is {{collapse}}, it would be nice to return all documents 
> with a {{null}} value in the expand block if {{expand=true}}.
> Also, when used with {{expand.field}}, even then we should return such 
> documents.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14330) Return docs with null value in expand for field when collapse has nullPolicy=collapse

2021-02-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276725#comment-17276725
 ] 

ASF subversion and git services commented on SOLR-14330:


Commit 15aaec60d9bfa96f2837c38b7ca83e2c87c66d8d in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=15aaec6 ]

SOLR-14330: ExpandComponent now supports an expand.nullGroup=true option


> Return docs with null value in expand for field when collapse has 
> nullPolicy=collapse
> -
>
> Key: SOLR-14330
> URL: https://issues.apache.org/jira/browse/SOLR-14330
> Project: Solr
>  Issue Type: Wish
>Reporter: Munendra S N
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14330.patch, SOLR-14330.patch
>
>
> When documents don't contain a value for the field, then with collapse those 
> documents can either be ignored (default), collapsed (one document is 
> chosen), or expanded (all are returned). This is controlled by {{nullPolicy}}.
> When {{nullPolicy}} is {{collapse}}, it would be nice to return all documents 
> with a {{null}} value in the expand block if {{expand=true}}.
> Also, when used with {{expand.field}}, even then we should return such 
> documents.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2230: SOLR-15011: /admin/logging handler is configured logs to all nodes

2021-02-01 Thread GitBox


dsmiley commented on a change in pull request #2230:
URL: https://github.com/apache/lucene-solr/pull/2230#discussion_r568204361



##
File path: solr/CHANGES.txt
##
@@ -69,6 +69,8 @@ Improvements
 * SOLR-14949: Docker: Ability to customize the FROM image when building.
   (Houston Putman)
 
+* SOLR-15011: /admin/logging handler should be able to configure logs on all 
nodes (Nazerke Seidan, David Smiley)

Review comment:
   ```suggestion
   * SOLR-15011: /admin/logging handler will now propagate setLevel (log 
threshold) to all nodes
   when told to. The admin UI now tells it to. (Nazerke Seidan, David Smiley)
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-9705) Move all codec formats to the o.a.l.codecs.Lucene90 package

2021-02-01 Thread Julie Tibshirani (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276599#comment-17276599
 ] 

Julie Tibshirani edited comment on LUCENE-9705 at 2/1/21, 11:02 PM:


{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new version like 9.1, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through service discovery. If a user 
wants to read indices from a previous version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to major version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in jars from Lucene 7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.
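
As a hedged illustration of that SPI lookup (reading older indices needs lucene-backward-codecs on the classpath so the older codec name can be resolved; the codec name below is just an example):

{noformat}
import org.apache.lucene.codecs.Codec;

public class CodecLookupExample {
  public static void main(String[] args) {
    Codec current = Codec.getDefault();        // codec used for newly written segments
    Codec older = Codec.forName("Lucene87");   // older codec, resolved through the SPI loader
    System.out.println(current.getName() + " / " + older.getName());
  }
}
{noformat}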


was (Author: julietibs):
{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through service discovery. If a user 
wants to read indices from a previous version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to major version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in jars from Lucene 7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> ---
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the 
> Lucene version they were created. With the upcoming release of Lucene 9.0, it 
> would be nice 

[jira] [Commented] (SOLR-8393) Component for Solr resource usage planning

2021-02-01 Thread Isabelle Giguere (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276715#comment-17276715
 ] 

Isabelle Giguere commented on SOLR-8393:


New patch, off current master.

The 'sizeUnit' parameter is supported for both the SizeComponent and ClusterSizing.

If the 'sizeUnit' parameter is present, values are output as a 'double', 
according to the chosen size unit.
The value of 'estimated-num-docs' remains a 'long'.
The default behavior, if 'sizeUnit' is not present, is the human-readable format.

Valid values for 'sizeUnit' are: GB, MB, KB, bytes.

Note about the implementation:
ClusterSizing calls the SizeComponent via HTTP, so the returned results per 
collection are already formatted according to 'sizeUnit' (or the lack of it). As a 
consequence, ClusterSizing needs to toggle back and forth between 
human-readable values and raw long values to support the requested 'sizeUnit'.
I don't know how we could intercept the SizeComponent response and receive 
just the long values, so the conversion to a 'sizeUnit' would happen only once in 
ClusterSizing, while keeping the formatting in SizeComponent for use cases 
that call it directly.
Would a response transformer be the right approach?
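
For illustration, a hedged sketch of the size-unit conversion described above (not the patch's actual code; names and rounding are assumptions):

{noformat}
public class SizeUnitExample {
  enum SizeUnit {
    GB(1L << 30), MB(1L << 20), KB(1L << 10), BYTES(1L);
    final long divisor;
    SizeUnit(long divisor) { this.divisor = divisor; }
  }

  // Raw byte count converted to the requested 'sizeUnit' as a double.
  static double convert(long sizeInBytes, SizeUnit unit) {
    return (double) sizeInBytes / unit.divisor;
  }

  public static void main(String[] args) {
    System.out.println(convert(1_572_864L, SizeUnit.MB)); // 1.5
  }
}
{noformat}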

> Component for Solr resource usage planning
> --
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
>  Issue Type: Improvement
>Reporter: Steve Molloy
>Priority: Major
> Attachments: SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, 
> SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, 
> SOLR-8393.patch, SOLR-8393.patch, SOLR-8393_tag_7.5.0.patch
>
>
> One question that keeps coming back is how much disk and RAM do I need to run 
> Solr. The most common response is that it highly depends on your data. While 
> true, it makes for frustrated users trying to plan their deployments. 
> The idea I'm bringing is to create a new component that will attempt to 
> extrapolate resources needed in the future by looking at resources currently 
> used. By adding a parameter for the target number of documents, current 
> resources are adapted by a ratio relative to current number of documents.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-8393) Component for Solr resource usage planning

2021-02-01 Thread Isabelle Giguere (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabelle Giguere updated SOLR-8393:
---
Attachment: SOLR-8393.patch

> Component for Solr resource usage planning
> --
>
> Key: SOLR-8393
> URL: https://issues.apache.org/jira/browse/SOLR-8393
> Project: Solr
>  Issue Type: Improvement
>Reporter: Steve Molloy
>Priority: Major
> Attachments: SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, 
> SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, 
> SOLR-8393.patch, SOLR-8393.patch, SOLR-8393_tag_7.5.0.patch
>
>
> One question that keeps coming back is how much disk and RAM do I need to run 
> Solr. The most common response is that it highly depends on your data. While 
> true, it makes for frustrated users trying to plan their deployments. 
> The idea I'm bringing is to create a new component that will attempt to 
> extrapolate resources needed in the future by looking at resources currently 
> used. By adding a parameter for the target number of documents, current 
> resources are adapted by a ratio relative to current number of documents.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2267: LUCENE-9707: Hunspell: check Lucene's implementation against Hunspel's test data

2021-02-01 Thread GitBox


dweiss commented on a change in pull request #2267:
URL: https://github.com/apache/lucene-solr/pull/2267#discussion_r568187636



##
File path: 
lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/SpellCheckerTest.java
##
@@ -61,59 +61,74 @@ public void needAffixOnAffixes() throws Exception {
 doTest("needaffix5");
   }
 
+  @Test
   public void testBreak() throws Exception {
 doTest("break");
   }
 
-  public void testBreakDefault() throws Exception {
+  @Test
+  public void breakDefault() throws Exception {
 doTest("breakdefault");
   }
 
-  public void testBreakOff() throws Exception {
+  @Test
+  public void breakOff() throws Exception {
 doTest("breakoff");
   }
 
-  public void testCompoundrule() throws Exception {
+  @Test
+  public void compoundrule() throws Exception {
 doTest("compoundrule");
   }
 
-  public void testCompoundrule2() throws Exception {
+  @Test
+  public void compoundrule2() throws Exception {
 doTest("compoundrule2");
   }
 
-  public void testCompoundrule3() throws Exception {
+  @Test
+  public void compoundrule3() throws Exception {
 doTest("compoundrule3");
   }
 
-  public void testCompoundrule4() throws Exception {
+  @Test
+  public void compoundrule4() throws Exception {
 doTest("compoundrule4");
   }
 
-  public void testCompoundrule5() throws Exception {
+  @Test
+  public void compoundrule5() throws Exception {
 doTest("compoundrule5");
   }
 
-  public void testCompoundrule6() throws Exception {
+  @Test
+  public void compoundrule6() throws Exception {
 doTest("compoundrule6");
   }
 
-  public void testCompoundrule7() throws Exception {
+  @Test
+  public void compoundrule7() throws Exception {
 doTest("compoundrule7");
   }
 
-  public void testCompoundrule8() throws Exception {
+  @Test
+  public void compoundrule8() throws Exception {
 doTest("compoundrule8");
   }
 
-  public void testGermanCompounding() throws Exception {
+  @Test
+  public void germanCompounding() throws Exception {
 doTest("germancompounding");
   }
 
   protected void doTest(String name) throws Exception {
-InputStream affixStream =
-Objects.requireNonNull(getClass().getResourceAsStream(name + ".aff"), 
name);
-InputStream dictStream =
-Objects.requireNonNull(getClass().getResourceAsStream(name + ".dic"), 
name);
+checkSpellCheckerExpectations(

Review comment:
   Ah... can't push to your repo (there is a checkbox to enable committers 
to do so - please use it, makes edits easier :). Here is the commit:
   
   
https://github.com/dweiss/lucene-solr/commit/618a2d3b5bb51eb0e35322a9c56b97bdce7d728b





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2267: LUCENE-9707: Hunspell: check Lucene's implementation against Hunspel's test data

2021-02-01 Thread GitBox


dweiss commented on a change in pull request #2267:
URL: https://github.com/apache/lucene-solr/pull/2267#discussion_r568186778



##
File path: 
lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/SpellCheckerTest.java
##
@@ -61,59 +61,74 @@ public void needAffixOnAffixes() throws Exception {
 doTest("needaffix5");
   }
 
+  @Test
   public void testBreak() throws Exception {
 doTest("break");
   }
 
-  public void testBreakDefault() throws Exception {
+  @Test
+  public void breakDefault() throws Exception {
 doTest("breakdefault");
   }
 
-  public void testBreakOff() throws Exception {
+  @Test
+  public void breakOff() throws Exception {
 doTest("breakoff");
   }
 
-  public void testCompoundrule() throws Exception {
+  @Test
+  public void compoundrule() throws Exception {
 doTest("compoundrule");
   }
 
-  public void testCompoundrule2() throws Exception {
+  @Test
+  public void compoundrule2() throws Exception {
 doTest("compoundrule2");
   }
 
-  public void testCompoundrule3() throws Exception {
+  @Test
+  public void compoundrule3() throws Exception {
 doTest("compoundrule3");
   }
 
-  public void testCompoundrule4() throws Exception {
+  @Test
+  public void compoundrule4() throws Exception {
 doTest("compoundrule4");
   }
 
-  public void testCompoundrule5() throws Exception {
+  @Test
+  public void compoundrule5() throws Exception {
 doTest("compoundrule5");
   }
 
-  public void testCompoundrule6() throws Exception {
+  @Test
+  public void compoundrule6() throws Exception {
 doTest("compoundrule6");
   }
 
-  public void testCompoundrule7() throws Exception {
+  @Test
+  public void compoundrule7() throws Exception {
 doTest("compoundrule7");
   }
 
-  public void testCompoundrule8() throws Exception {
+  @Test
+  public void compoundrule8() throws Exception {
 doTest("compoundrule8");
   }
 
-  public void testGermanCompounding() throws Exception {
+  @Test
+  public void germanCompounding() throws Exception {
 doTest("germancompounding");
   }
 
   protected void doTest(String name) throws Exception {
-InputStream affixStream =
-Objects.requireNonNull(getClass().getResourceAsStream(name + ".aff"), 
name);
-InputStream dictStream =
-Objects.requireNonNull(getClass().getResourceAsStream(name + ".dic"), 
name);
+checkSpellCheckerExpectations(

Review comment:
   You can't really convert resource URLs to paths with url.getPath. This 
breaks, as I suspected. On Windows you get:
   ```
   java.nio.file.InvalidPathException: Illegal char <:> at index 2: 
/C:/Work/apache/lucene/lucene.master/lucene/analysis/common/build/classes/java/test/org/apache/lucene/analysis/hunspell/i53643.aff
  > at 
__randomizedtesting.SeedInfo.seed([FE61D482FAEDBB53:CE18D8B46A2785A8]:0)
  > at 
java.base/sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
  > at 
java.base/sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
  > at 
java.base/sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
  > at java.base/sun.nio.fs.WindowsPath.parse(WindowsPath.java:92)
  > at 
java.base/sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:229)
  > at java.base/java.nio.file.Path.of(Path.java:147)
   ```
   
   A better method is to go through the URI - Path.of(url.toURI()). I've 
modified the code slightly, please take a look.  
   
   Also, can you rename the tests to follow the TestXXX convention? This may be 
enforced in the future and will spare somebody the work of renaming them.
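   
   For reference, a hedged minimal example of the two approaches (the resource name is taken from the stack trace above and is otherwise illustrative):
   
   ```
   import java.net.URISyntaxException;
   import java.net.URL;
   import java.nio.file.Path;
   
   class ResourcePathExample {
     Path resolve() throws URISyntaxException {
       URL url = ResourcePathExample.class.getResource("i53643.aff");
       // Path.of(url.getPath()) breaks on Windows: the path looks like "/C:/...".
       // Going through the URI works across platforms:
       return Path.of(url.toURI());
     }
   }
   ```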





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-15127) All-In-One Dockerfile for building local images as well as reproducible release builds directly from (remote) git tags

2021-02-01 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-15127:
--
Attachment: SOLR-15127.patch
Status: Open  (was: Open)


The attached patch implements this idea and seems to work well -- there are 
some nocommits, but they aren't necessarily problems that need to be fixed, so 
much as comments to draw attention to some specific changes for discussion.

Basic usage is spelled out in the Dockerfile comments...

{noformat}
# This Dockerfile can be used in 2 distinct ways:
# 1) For Solr developers with a java/gradle development env, this file is used 
by gradle to build docker images
#from your local builds (with or w/o local modifications).  When doing 
this, gradle will use a docker build context
#containing pre-built artifacts from previous gradle targets
#   EX: ./gradlew -p solr/docker dockerBuild
#
# 2) Solr users, with or w/o a local java/gradle development env, can pass this 
Dockerfile directly to docker build,
#using the root level checkout of the project -- or a remote git URL -- 
as the docker build context.  When doing
#this, docker will invoke gradle to build all necessary artifacts
#   EX: docker build --file solr/docker/Dockerfile .
#   docker build --file solr/docker/Dockerfile 
https://gitbox.apache.org/repos/asf/lucene-solr.git
#   docker build --file solr/docker/Dockerfile 
https://gitbox.apache.org/repos/asf/lucene-solr.git#branch_9x
#
# This last format is the method used by Solr Release Managers to build the 
official apache/solr images uploaded to hub.docker.com
#
# EX: docker build --build-arg SOLR_VERSION=9.0.0 \
#  --tag apache/solr:9.0.0 \
#  --file solr/docker/Dockerfile \
#  
https://gitbox.apache.org/repos/asf/lucene-solr.git#releases/lucene-solr/9.0.0
{noformat}

...although the direct "docker build" usage could be drastically simplified 
once Solr has its own TLP/git repo, if we're willing to keep the Dockerfile in 
the root of the repo.

[~houstonputman] / [~dsmiley] / [~janhoy]: what do you guys think of this 
overall approach?


> All-In-One Dockerfile for building local images as well as reproducible 
> release builds directly from (remote) git tags
> --
>
> Key: SOLR-15127
> URL: https://issues.apache.org/jira/browse/SOLR-15127
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-15127.patch
>
>
> There was a recent dev@lucene discussion about the future of the 
> github/docker-solr repo and (Apache) "official" solr docker images and using 
> the "apache/solr" naming vs (docker-library official) "_/solr" names...
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202101.mbox/%3CCAD4GwrNCPEnAJAjy4tY%3DpMeX5vWvnFyLe9ZDaXmF4J8XchA98Q%40mail.gmail.com%3E
> In that discussion, mak pointed out that docker-library evidently allows for 
> some more flexibility in the way "official" docker-library packages can be 
> built (compared to the rules that were evidently in place when mak set up 
> the current docker-solr image building process/tooling), pointing out how the 
> "docker official" elasticsearch images are currently built from the "elastic 
> official" elasticsearch images...
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202101.mbox/%3C3CED9683-1DD2-4F08-97F9-4FC549EDE47D%40greenhills.co.uk%3E
> Based on this, I proposed that we could probably restructure the Solr 
> Dockerfile so that it could be useful for both "local development" -- using 
> the current repo checkout -- as well as for "apache official" apache/solr 
> images that could be reproducibly built directly from pristine git tags using 
> the remote git URL syntax supported by "docker build" (and then -- evidently 
> -- extended by trivial one line Dockerfiles for the "docker-library official" 
> _/solr images)...
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202101.mbox/%3Calpine.DEB.2.21.2101221423340.16298%40slate%3E
> This jira tracks this idea.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2267: LUCENE-9707: Hunspell: check Lucene's implementation against Hunspel's test data

2021-02-01 Thread GitBox


dweiss commented on a change in pull request #2267:
URL: https://github.com/apache/lucene-solr/pull/2267#discussion_r568176360



##
File path: 
lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestsFromOriginalHunspellRepository.java
##
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.nio.file.DirectoryStream;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.ParseException;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.Set;
+import java.util.TreeSet;
+import java.util.stream.Collectors;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+/**
+ * Same as {@link SpellCheckerTest}, but checks all Hunspell's test data. The 
path to the checked
+ * out Hunspell repository should be in {@code -Dhunspell.repo.path=...} 
system property.
+ */
+@RunWith(Parameterized.class)

Review comment:
   Filed an issue for myself here: 
https://github.com/randomizedtesting/randomizedtesting/issues/295.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-15127) All-In-One Dockerfile for building local images as well as reproducible release builds directly from (remote) git tags

2021-02-01 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-15127:
-

 Summary: All-In-One Dockerfile for building local images as well 
as reproducible release builds directly from (remote) git tags
 Key: SOLR-15127
 URL: https://issues.apache.org/jira/browse/SOLR-15127
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter



There was a recent dev@lucene discussion about the future of the 
github/docker-solr repo and (Apache) "official" solr docker images and using 
the "apache/solr" naming vs (docker-library official) "_/solr" names...

http://mail-archives.apache.org/mod_mbox/lucene-dev/202101.mbox/%3CCAD4GwrNCPEnAJAjy4tY%3DpMeX5vWvnFyLe9ZDaXmF4J8XchA98Q%40mail.gmail.com%3E

In that discussion, mak pointed out that docker-library evidently allows for 
some more flexibility in the way "official" docker-library packages can be 
built (compared to the rules that were evidently in place when mak set up the 
current docker-solr image building process/tooling), pointing out how the 
"docker official" elasticsearch images are currently built from the "elastic 
official" elasticsearch images...

http://mail-archives.apache.org/mod_mbox/lucene-dev/202101.mbox/%3C3CED9683-1DD2-4F08-97F9-4FC549EDE47D%40greenhills.co.uk%3E

Based on this, I proposed that we could probably restructure the Solr 
Dockerfile so that it could be useful for both "local development" -- using the 
current repo checkout -- as well as for "apache official" apache/solr images 
that could be reproducibly built directly from pristine git tags using the 
remote git URL syntax supported by "docker build" (and then -- evidently -- 
extended by trivial one line Dockerfiles for the "docker-library official" 
_/solr images)...

http://mail-archives.apache.org/mod_mbox/lucene-dev/202101.mbox/%3Calpine.DEB.2.21.2101221423340.16298%40slate%3E

This jira tracks this idea.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #2267: LUCENE-9707: Hunspell: check Lucene's implementation against Hunspel's test data

2021-02-01 Thread GitBox


dweiss commented on a change in pull request #2267:
URL: https://github.com/apache/lucene-solr/pull/2267#discussion_r568175124



##
File path: 
lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestsFromOriginalHunspellRepository.java
##
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.io.IOException;
+import java.nio.file.DirectoryStream;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.ParseException;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.Set;
+import java.util.TreeSet;
+import java.util.stream.Collectors;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+/**
+ * Same as {@link SpellCheckerTest}, but checks all Hunspell's test data. The 
path to the checked
+ * out Hunspell repository should be in {@code -Dhunspell.repo.path=...} 
system property.
+ */
+@RunWith(Parameterized.class)

Review comment:
   I checked intellij and parameterized tests tonight. It's what I was 
afraid of - test descriptions are emitted correctly (in my opinion) but they're 
*interpreted* differently depending on the tool (and time when you check...). 
   
   The reason why you see the class name and test method before each actual 
test is because these supposedly "hidden" elements allowed tools to go back to 
the source code of a test with an arbitrary name (if you double-click on a test 
in IntelliJ it will take you back to the test method). Relaunching of a single 
test must have changed at some point because it used to be an exact name 
filter... but now I it just reruns all tests under a test method (all parameter 
variations).
   
   It's worth mentioning that this isn't consistent even in IntelliJ itself - 
if I run a simple(r) parameterized test via IntelliJ launcher, I get this test 
suite tree:
   
   
![image](https://user-images.githubusercontent.com/199470/106523393-561b9700-64e1-11eb-9000-4c4a66117331.png)
   
   But when I run the same test via gradle launcher (from within the IDE), I 
get this tree:
   
   
![image](https://user-images.githubusercontent.com/199470/106523340-3e441300-64e1-11eb-958f-dc36ece73d66.png)
   
   I don't know if there is a way to make all the tools happy; test 
descriptions and nesting are broken in JUnit 4.x itself.
   
   Given the above, please feel free to revert back to what works for you. I'd 
name the test class TestHunspellRepositoryTestCases for clarity. Also, this 
test will not run under Lucene test framework because the security manager 
won't let you access arbitrary paths outside the build location. You'd need to 
add this to tests.policy:
   ```
   permission java.io.FilePermission "${hunspell.repo.path}${/}-", "read";
   ```
   Don't know whether it's worth it at the moment though.
   








[jira] [Resolved] (SOLR-15125) Link to docs is broken

2021-02-01 Thread Cassandra Targett (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett resolved SOLR-15125.
--
Resolution: Fixed

The problem has been fixed and the docs are available again.

> Link to docs is broken
> ---
>
> Key: SOLR-15125
> URL: https://issues.apache.org/jira/browse/SOLR-15125
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: website
>Reporter: Thomas Güttler
>Priority: Minor
>
> On this page: https://lucene.apache.org/solr/guide/
> the link to https://lucene.apache.org/solr/guide/8_8/
> is broken.






[jira] [Commented] (LUCENE-9718) REGEX Pattern Search, character classes with quantifiers do not work

2021-02-01 Thread Brian Feldman (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276685#comment-17276685
 ] 

Brian Feldman commented on LUCENE-9718:
---

1) There is limited user-level documentation upstream in Solr or Elasticsearch. 
Receiving no error or results back from a search system, some users might simply 
believe no matches exist rather than that their syntax is not supported. I did 
not realize it was an issue until playing around with it.

2) Besides being documented, the code can be improved; only the initial parsing 
code would need updating. It does not affect the logic for running the 
automaton. And since there is already code to support character classes, the 
parsing code should logically be completed to support trailing quantifiers, in 
order to finish the implementation for character classes.
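
For reference, a minimal sketch (not from the issue itself) of the behavior and the 
documented workaround, expressed as Lucene queries; the field name is only illustrative:
{noformat}
// Hedged sketch: per the report, the \d shorthand with a quantifier silently
// matches nothing, while an explicit character class with the same quantifier works.
Query quiet = new RegexpQuery(new Term("body", "\\d{2}"));      // reported to return nothing, with no error
Query works = new RegexpQuery(new Term("body", "[0-9]{2,3}"));  // workaround from the report
{noformat}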

> REGEX Pattern Search, character classes with quantifiers do not work
> 
>
> Key: LUCENE-9718
> URL: https://issues.apache.org/jira/browse/LUCENE-9718
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.7.3, 8.6.3
>Reporter: Brian Feldman
>Priority: Minor
>  Labels: Documentation, RegEx
>
> Character classes with a quantifier do not work, no error is given and no 
> results are returned. For example \d\{2} or \d\{2,3} as is commonly written 
> in most languages supporting regular expressions, simply and quietly does not 
> work.  A user work around is to write them fully out such as \d\d or 
> [0-9][0-9] or as [0-9]\{2,3} .
>  
> This inconsistency or limitation is not documented, wasting the time of users 
> as they have to figure this out themselves. I believe this inconsistency 
> should be clearly documented and an effort to fixing the inconsistency would 
> improve pattern searching.






[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


thelabdude commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568152014



##
File path: main.go
##
@@ -65,6 +69,7 @@ func init() {
 
_ = solrv1beta1.AddToScheme(scheme)
_ = zkv1beta1.AddToScheme(scheme)
+   _ = certv1.AddToScheme(scheme)
 
// +kubebuilder:scaffold:scheme
flag.BoolVar(, "zk-operator", true, "The operator will 
not use the zk operator & crd when this flag is set to false.")

Review comment:
   From a reconcile perspective, we really only care about the TLS secret 
that cert-manager creates once the Certificate is issued. The "watching" of the 
Certificate to come online is really for status reporting while the cert is 
issuing as it can take several minutes for the cert to be issued. Notice the 
`isCertificateReady` is mostly about checking for the TLS secret.
   
   The operator does create a Certificate for `autoCreate` mode but in that 
case, the cert definition should come from the SolrCloud CRD and we don't want 
to let users edit the Certificate externally; this is similar to the default 
`solr.xml` ConfigMap and any direct edits to that cm are lost, same with 
`autoCreate` certs.








[GitHub] [lucene-solr] mrsoong closed pull request #1589: SOLR-13195: added check for missing shards param in SearchHandler

2021-02-01 Thread GitBox


mrsoong closed pull request #1589:
URL: https://github.com/apache/lucene-solr/pull/1589


   






[GitHub] [lucene-solr] mrsoong closed pull request #1472: SOLR-13184: Added some input validation in ValueSourceParser

2021-02-01 Thread GitBox


mrsoong closed pull request #1472:
URL: https://github.com/apache/lucene-solr/pull/1472


   






[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


HoustonPutman commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568148914



##
File path: main.go
##
@@ -65,6 +69,7 @@ func init() {
 
_ = solrv1beta1.AddToScheme(scheme)
_ = zkv1beta1.AddToScheme(scheme)
+   _ = certv1.AddToScheme(scheme)
 
// +kubebuilder:scaffold:scheme
flag.BoolVar(, "zk-operator", true, "The operator will 
not use the zk operator & crd when this flag is set to false.")

Review comment:
   Ahh sorry for the confusion.
   
   Yeah, it looks like Solr Operator is creating its own Secrets as well as 
finding secrets created by CertManager. If that's the case, then I think we 
will need both "Owns" and "Watches" with similar logic to the ConfigMaps. But I 
may be wrong there, this is uncharted territory.
   
   Are we sure we don't need to own Certificates? We wait for them to come 
online, so we want to be notified when they have condition changes, right?








[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


thelabdude commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568141856



##
File path: main.go
##
@@ -65,6 +69,7 @@ func init() {
 
_ = solrv1beta1.AddToScheme(scheme)
_ = zkv1beta1.AddToScheme(scheme)
+   _ = certv1.AddToScheme(scheme)
 
// +kubebuilder:scaffold:scheme
flag.BoolVar(, "zk-operator", true, "The operator will 
not use the zk operator & crd when this flag is set to false.")

Review comment:
   Ok I see, I was confused because you put the comment on the 
`AddToScheme` line so I thought the problem was about that line of code.
   
   I don't think the Solr operator needs to own `Certificate` objects ... all 
it cares about is the TLS secret that gets created by the cert-manager in 
response to a change to the `Certificate`. It seems like the secret changing 
does trigger a reconcile in my testing but maybe we need to add a specific 
watch for that secret changing like you did for user-provided ConfigMaps?
   








[GitHub] [lucene-solr] dweiss commented on a change in pull request #2258: LUCENE-9686: Fix read past EOF handling in DirectIODirectory

2021-02-01 Thread GitBox


dweiss commented on a change in pull request #2258:
URL: https://github.com/apache/lucene-solr/pull/2258#discussion_r568136570



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -381,17 +377,18 @@ public long length() {
 @Override
 public byte readByte() throws IOException {
   if (!buffer.hasRemaining()) {
-refill();
+refill(1);
   }
+
   return buffer.get();
 }
 
-private void refill() throws IOException {
+private void refill(int byteToRead) throws IOException {

Review comment:
   Should it be plural (bytesToRead)?

##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -381,17 +377,18 @@ public long length() {
 @Override
 public byte readByte() throws IOException {
   if (!buffer.hasRemaining()) {
-refill();
+refill(1);
   }
+
   return buffer.get();
 }
 
-private void refill() throws IOException {
+private void refill(int byteToRead) throws IOException {
   filePos += buffer.capacity();
 
   // BaseDirectoryTestCase#testSeekPastEOF test for consecutive read past 
EOF,
   // hence throwing EOFException early to maintain buffer state (position 
in particular)
-  if (filePos > channel.size()) {
+  if (filePos > channel.size() || (channel.size() - filePos < byteToRead)) 
{

Review comment:
   I wonder if we should move the channel's position to actually point 
after the last byte, then throw EOFException... so that we not only indicate an 
EOF but also leave the channel pointing at the end. I have a scenario in 
mind where somebody tries to read a bulk of bytes, hits an EOF, but then a 
single-byte read() succeeds. That would be awkward, wouldn't it? 
   
   A refill should try to read as many bytes as it can (min(channel.size() 
- filePos, bytesToRead)), then potentially fail if bytesToRead is still >0 and 
channel is at EOF. Or is my thinking flawed somewhere?
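   
   A rough sketch of that refill behavior, reusing the field names from the 
surrounding diff (details are guesses, not a proposed patch):
   ```
   // Hedged sketch only: read what is available, signal EOF when the caller cannot be satisfied.
   private void refill(int bytesToRead) throws IOException {
     long remaining = channel.size() - filePos;          // bytes still available in the file
     int toRead = (int) Math.min(remaining, buffer.capacity());
     if (toRead < bytesToRead) {
       filePos = channel.size();                         // leave the position at EOF
       throw new EOFException("read past EOF: " + this);
     }
     // ... otherwise fill `buffer` with toRead bytes from `channel` and advance filePos ...
   }
   ```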








[jira] [Commented] (SOLR-8319) NPE when creating pivot

2021-02-01 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276653#comment-17276653
 ] 

Houston Putman commented on SOLR-8319:
--

The query isn't coming from QParser.parse. Instead 
QueryBuilder.createFieldQuery is explicitly returning a null Query in a few 
code paths, such as when there are no tokens contained within a field search 
value. This is hit when a stopword is provided as a value.
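
A small, hedged sketch of the kind of guard this suggests, combining the null 
observation above with the TermQuery fallback from the issue description below 
(not a committed fix):
{noformat}
// Hedged sketch: fall back to an exact term lookup when analysis produces no query.
Query query = ft.getFieldQuery(null, field, pivotValue);
if (query == null) {
  // the pivot value came straight from the index, so an exact TermQuery is a reasonable fallback
  query = new TermQuery(new Term(field.getName(), pivotValue));
}
{noformat}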

> NPE when creating pivot
> ---
>
> Key: SOLR-8319
> URL: https://issues.apache.org/jira/browse/SOLR-8319
> Project: Solr
>  Issue Type: Bug
>Reporter: Neil Ireson
>Priority: Major
> Attachments: SOLR-8319.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I get a NPE, the trace is shown at the end.
> The problem seems to be this line in the getSubset method:
>   Query query = ft.getFieldQuery(null, field, pivotValue);
> Which takes a value from the index and then analyses it to create a query. I 
> believe the problem is that when my analysis process is applied twice it 
> results in a null query. OK this might be seen as my issue because of dodgy 
> analysis, I thought it might be because I have the wrong order with 
> LengthFilterFactory before EnglishPossessiveFilterFactory and 
> KStemFilterFactory, i.e.:
> 
> 
>  
> So that "cat's" -> "cat" -> "", however any filter order I tried still 
> resulted in a NPE, and perhaps there is a viable case where parsing a term 
> twice results in a null query.
> The thing is I don't see why when the query term comes from the index it has 
> to undergo any analysis. If the term is from the index can it not simply be 
> created using a TermQuery, which I would imagine would also be faster. I 
> altered the "getFieldQuery" line above to the following and that has fixed my 
> NPE issue.
>   Query query = new TermQuery(new Term(field.getName(), pivotValue));
> So far this hasn't caused any other issues but perhaps that is due to my use 
> of Solr, rather than actually fixing an issue. 
> o.a.s.c.SolrCore java.lang.NullPointerException
> at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
> at 
> org.apache.solr.util.ConcurrentLRUCache.get(ConcurrentLRUCache.java:91)
> at org.apache.solr.search.FastLRUCache.get(FastLRUCache.java:130)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:1296)
> at 
> org.apache.solr.handler.component.PivotFacetProcessor.getSubset(PivotFacetProcessor.java:375)
> at 
> org.apache.solr.handler.component.PivotFacetProcessor.doPivots(PivotFacetProcessor.java:305)
> at 
> org.apache.solr.handler.component.PivotFacetProcessor.processSingle(PivotFacetProcessor.java:228)
> at 
> org.apache.solr.handler.component.PivotFacetProcessor.process(PivotFacetProcessor.java:170)
> at 
> org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:262)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:669)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:462)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:214)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
> at 
> 

[jira] [Commented] (SOLR-13209) NullPointerException from call in org.apache.solr.search.SolrIndexSearcher.getDocSet

2021-02-01 Thread Isabelle Giguere (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276644#comment-17276644
 ] 

Isabelle Giguere commented on SOLR-13209:
-

[~cader.hancock] : Nothing to worry about.  I get lost in Solr every time I get 
back to it.

Grouping.java looks like a good place to start.  It seems the request from the 
description can go through the execute() method without error.  Maybe that's 
wanted.  But if Grouping.CommandQuery expects to work with a valid "query", 
then I think Grouping.CommandQuery.prepare() should throw an exception if 
"query" is null ?

That's just from looking at the code, so, it needs testing.
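
As a hedged sketch only, the kind of check being suggested might look roughly like 
this (the exact location in Grouping.CommandQuery.prepare() and the exception type 
would need verifying):
{noformat}
// Hedged sketch: fail fast with a clear error instead of an NPE in the filter cache.
if (query == null) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "grouping command produced a null query; check the group.query/group.field parameters");
}
{noformat}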

> NullPointerException from call in 
> org.apache.solr.search.SolrIndexSearcher.getDocSet
> 
>
> Key: SOLR-13209
> URL: https://issues.apache.org/jira/browse/SOLR-13209
> Project: Solr
>  Issue Type: Bug
>Affects Versions: master (9.0)
> Environment: h1. Steps to reproduce
> * Use a Linux machine.
> * Build commit {{ea2c8ba}} of Solr as described in the section below.
> * Build the films collection as described below.
> * Start the server using the command {{./bin/solr start -f -p 8983 -s 
> /tmp/home}}
> * Request the URL given in the bug description.
> h1. Compiling the server
> {noformat}
> git clone https://github.com/apache/lucene-solr
> cd lucene-solr
> git checkout ea2c8ba
> ant compile
> cd solr
> ant server
> {noformat}
> h1. Building the collection and reproducing the bug
> We followed [Exercise 
> 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from 
> the [Solr 
> Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html].
> {noformat}
> mkdir -p /tmp/home
> echo '' > 
> /tmp/home/solr.xml
> {noformat}
> In one terminal start a Solr instance in foreground:
> {noformat}
> ./bin/solr start -f -p 8983 -s /tmp/home
> {noformat}
> In another terminal, create a collection of movies, with no shards and no 
> replication, and initialize it:
> {noformat}
> bin/solr create -c films
> curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": 
> {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' 
> http://localhost:8983/solr/films/schema
> curl -X POST -H 'Content-type:application/json' --data-binary 
> '{"add-copy-field" : {"source":"*","dest":"_text_"}}' 
> http://localhost:8983/solr/films/schema
> ./bin/post -c films example/films/films.json
> curl -v “URL_BUG”
> {noformat}
> Please check the issue description below to find the “URL_BUG” that will 
> allow you to reproduce the issue reported.
>Reporter: Cesar Rodriguez
>Priority: Minor
>  Labels: diffblue, newdev
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Requesting the following URL causes Solr to return an HTTP 500 error response:
> {noformat}
> http://localhost:8983/solr/films/select?group=true
> {noformat}
> The error response seems to be caused by the following uncaught exception:
> {noformat}
>  java.lang.NullPointerException
>   at 
> java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>   at 
> org.apache.solr.util.ConcurrentLRUCache.get(ConcurrentLRUCache.java:124)
>   at org.apache.solr.search.FastLRUCache.get(FastLRUCache.java:163)
>   at 
> org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:792)
>   at 
> org.apache.solr.search.Grouping$CommandQuery.createFirstPassCollector(Grouping.java:860)
>   at org.apache.solr.search.Grouping.execute(Grouping.java:327)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessGroupedSearch(QueryComponent.java:1408)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:365)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2559)
> [...]
> {noformat}
> Method {{org.apache.solr.search.SolrIndexSearcher.getDocSet()}}, at line 792 
> calls {{filterCache.get(absQ)}} where {{absQ}} is a null pointer. I think 
> this null pointer comes in fact from the caller, but I don't fully follow the 
> logic of the code.
> To set up an environment to reproduce this bug, follow the description in the 
> ‘Environment’ field.
> We automatically found this issue and ~70 more like this using [Diffblue 
> Microservices Testing|https://www.diffblue.com/labs/?utm_source=solr-br]. 
> Find more information on this [fuzz testing 
> campaign|https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results?utm_source=solr-br].




[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


thelabdude commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568133031



##
File path: controllers/solrcloud_controller.go
##
@@ -772,3 +848,188 @@ func (r *SolrCloudReconciler) 
indexAndWatchForProvidedConfigMaps(mgr ctrl.Manage
},

builder.WithPredicates(predicate.ResourceVersionChangedPredicate{})), nil
 }
+
+// Reconciles the TLS cert, returns either a bool to indicate if the cert is 
ready or an error
+func (r *SolrCloudReconciler) reconcileAutoCreateTLS(ctx context.Context, 
instance *solr.SolrCloud) (bool, error) {
+
+   // short circuit this method with a quick check if the cert exists and 
is ready
+   // this is useful b/c it may take many minutes for a cert to be issued, 
so we avoid
+   // all the other checking that happens below while we're waiting for 
the cert
+   foundCert := {}
+   if err := r.Get(ctx, types.NamespacedName{Name: 
instance.Spec.SolrTLS.AutoCreate.Name, Namespace: instance.Namespace}, 
foundCert); err == nil {
+   // cert exists, but is it ready? need to wait until we see the 
TLS secret
+   if foundTLSSecret := r.isCertificateReady(ctx, foundCert, 
instance.Spec.SolrTLS); foundTLSSecret != nil {
+   cert := util.GenerateCertificate(instance)
+   return r.afterCertificateReady(ctx, instance, , 
foundCert, foundTLSSecret)
+   }
+   }
+
+   r.Log.Info("Reconciling TLS config", "tls", instance.Spec.SolrTLS)
+
+   // cert not found, do full reconcile for TLS ...
+   var err error
+   var tlsReady bool
+
+   // First, create the keystore password secret if needed
+   keystoreSecret := util.GenerateKeystoreSecret(instance)
+   foundSecret := {}
+   err = r.Get(ctx, types.NamespacedName{Name: keystoreSecret.Name, 
Namespace: keystoreSecret.Namespace}, foundSecret)
+   if err != nil && errors.IsNotFound(err) {
+   r.Log.Info("Creating keystore secret", "namespace", 
keystoreSecret.Namespace, "name", keystoreSecret.Name)
+   if err := controllerutil.SetControllerReference(instance, 
, r.scheme); err != nil {
+   return false, err
+   }
+   err = r.Create(ctx, )
+   }
+   if err != nil {
+   return false, err
+   }
+
+   // Create a self-signed cert issuer if no issuerRef provided
+   if instance.Spec.SolrTLS.AutoCreate.IssuerRef == nil {
+   issuerName := fmt.Sprintf("%s-selfsigned-issuer", instance.Name)
+   foundIssuer := {}
+   err = r.Get(ctx, types.NamespacedName{Name: issuerName, 
Namespace: instance.Namespace}, foundIssuer)
+   if err != nil && errors.IsNotFound(err) {
+   // specified Issuer not found, let's go create a 
self-signed for this
+   issuer := util.GenerateSelfSignedIssuer(instance, 
issuerName)
+   if err := 
controllerutil.SetControllerReference(instance, , r.scheme); err != nil {
+   return false, err
+   }
+   r.Log.Info("Creating Self-signed Certificate Issuer", 
"issuer", issuer)
+   err = r.Create(ctx, )
+   } else if err == nil {
+   r.Log.Info("Found Self-signed Certificate Issuer", 
"issuer", issuerName)
+   }
+   if err != nil {
+   return false, err
+   }
+   } else {
+   // real problems arise if we create the Certificate and the 
Issuer doesn't exist so make we have a good config here
+   if instance.Spec.SolrTLS.AutoCreate.IssuerRef.Kind == "Issuer" {
+   foundIssuer := {}
+   err = r.Get(ctx, types.NamespacedName{Name: 
instance.Spec.SolrTLS.AutoCreate.IssuerRef.Name, Namespace: 
instance.Namespace}, foundIssuer)
+   if err != nil {
+   if errors.IsNotFound(err) {
+   r.Log.Info("cert-manager Issuer not 
found in namespace, cannot create a TLS certificate without an Issuer",
+   "issuer", 
instance.Spec.SolrTLS.AutoCreate.IssuerRef.Name, "ns", instance.Namespace)
+   }
+   return false, err
+   }
+   } // else assume ClusterIssuer and good luck
+   }
+
+   // Reconcile the Certificate to use for TLS ... A Certificate is a 
request to Issue the cert, the
+   // actual cert lives in a TLS secret created by the Issuer
+   cert := util.GenerateCertificate(instance)
+   err = r.Get(ctx, types.NamespacedName{Name: cert.Name, Namespace: 
cert.Namespace}, foundCert)
+   if err != nil && errors.IsNotFound(err) {
+  

[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


thelabdude commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568132449



##
File path: controllers/solrcloud_controller.go
##
@@ -772,3 +848,188 @@ func (r *SolrCloudReconciler) 
indexAndWatchForProvidedConfigMaps(mgr ctrl.Manage
},

builder.WithPredicates(predicate.ResourceVersionChangedPredicate{})), nil
 }
+
+// Reconciles the TLS cert, returns either a bool to indicate if the cert is 
ready or an error
+func (r *SolrCloudReconciler) reconcileAutoCreateTLS(ctx context.Context, 
instance *solr.SolrCloud) (bool, error) {
+
+   // short circuit this method with a quick check if the cert exists and 
is ready
+   // this is useful b/c it may take many minutes for a cert to be issued, 
so we avoid
+   // all the other checking that happens below while we're waiting for 
the cert
+   foundCert := {}
+   if err := r.Get(ctx, types.NamespacedName{Name: 
instance.Spec.SolrTLS.AutoCreate.Name, Namespace: instance.Namespace}, 
foundCert); err == nil {
+   // cert exists, but is it ready? need to wait until we see the 
TLS secret
+   if foundTLSSecret := r.isCertificateReady(ctx, foundCert, 
instance.Spec.SolrTLS); foundTLSSecret != nil {
+   cert := util.GenerateCertificate(instance)
+   return r.afterCertificateReady(ctx, instance, , 
foundCert, foundTLSSecret)
+   }
+   }
+
+   r.Log.Info("Reconciling TLS config", "tls", instance.Spec.SolrTLS)
+
+   // cert not found, do full reconcile for TLS ...
+   var err error
+   var tlsReady bool
+
+   // First, create the keystore password secret if needed
+   keystoreSecret := util.GenerateKeystoreSecret(instance)
+   foundSecret := {}
+   err = r.Get(ctx, types.NamespacedName{Name: keystoreSecret.Name, 
Namespace: keystoreSecret.Namespace}, foundSecret)
+   if err != nil && errors.IsNotFound(err) {
+   r.Log.Info("Creating keystore secret", "namespace", 
keystoreSecret.Namespace, "name", keystoreSecret.Name)
+   if err := controllerutil.SetControllerReference(instance, 
, r.scheme); err != nil {
+   return false, err
+   }
+   err = r.Create(ctx, )
+   }
+   if err != nil {
+   return false, err
+   }
+
+   // Create a self-signed cert issuer if no issuerRef provided
+   if instance.Spec.SolrTLS.AutoCreate.IssuerRef == nil {
+   issuerName := fmt.Sprintf("%s-selfsigned-issuer", instance.Name)
+   foundIssuer := {}
+   err = r.Get(ctx, types.NamespacedName{Name: issuerName, 
Namespace: instance.Namespace}, foundIssuer)
+   if err != nil && errors.IsNotFound(err) {
+   // specified Issuer not found, let's go create a 
self-signed for this
+   issuer := util.GenerateSelfSignedIssuer(instance, 
issuerName)
+   if err := 
controllerutil.SetControllerReference(instance, , r.scheme); err != nil {
+   return false, err
+   }
+   r.Log.Info("Creating Self-signed Certificate Issuer", 
"issuer", issuer)
+   err = r.Create(ctx, )
+   } else if err == nil {
+   r.Log.Info("Found Self-signed Certificate Issuer", 
"issuer", issuerName)
+   }
+   if err != nil {
+   return false, err
+   }
+   } else {
+   // real problems arise if we create the Certificate and the 
Issuer doesn't exist so make we have a good config here
+   if instance.Spec.SolrTLS.AutoCreate.IssuerRef.Kind == "Issuer" {
+   foundIssuer := {}
+   err = r.Get(ctx, types.NamespacedName{Name: 
instance.Spec.SolrTLS.AutoCreate.IssuerRef.Name, Namespace: 
instance.Namespace}, foundIssuer)
+   if err != nil {
+   if errors.IsNotFound(err) {
+   r.Log.Info("cert-manager Issuer not 
found in namespace, cannot create a TLS certificate without an Issuer",
+   "issuer", 
instance.Spec.SolrTLS.AutoCreate.IssuerRef.Name, "ns", instance.Namespace)
+   }
+   return false, err
+   }
+   } // else assume ClusterIssuer and good luck
+   }
+
+   // Reconcile the Certificate to use for TLS ... A Certificate is a 
request to Issue the cert, the
+   // actual cert lives in a TLS secret created by the Issuer
+   cert := util.GenerateCertificate(instance)
+   err = r.Get(ctx, types.NamespacedName{Name: cert.Name, Namespace: 
cert.Namespace}, foundCert)
+   if err != nil && errors.IsNotFound(err) {
+  

[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


thelabdude commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568131545



##
File path: controllers/util/common.go
##
@@ -248,6 +248,11 @@ func CopyIngressFields(from, to *extv1.Ingress, logger 
logr.Logger) bool {
}
}
 
+   if !requireUpdate && !DeepEqualWithNils(to.Spec.TLS, from.Spec.TLS) {

Review comment:
   just a mistake








[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


HoustonPutman commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568129098



##
File path: main.go
##
@@ -65,6 +69,7 @@ func init() {
 
_ = solrv1beta1.AddToScheme(scheme)
_ = zkv1beta1.AddToScheme(scheme)
+   _ = certv1.AddToScheme(scheme)
 
// +kubebuilder:scaffold:scheme
flag.BoolVar(, "zk-operator", true, "The operator will 
not use the zk operator & crd when this flag is set to false.")

Review comment:
   `AddToScheme` is fine; the boolean is used when setting up the 
controller and determining whether to set "Owns" for that resource.








[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


thelabdude commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568127921



##
File path: main.go
##
@@ -65,6 +69,7 @@ func init() {
 
_ = solrv1beta1.AddToScheme(scheme)
_ = zkv1beta1.AddToScheme(scheme)
+   _ = certv1.AddToScheme(scheme)
 
// +kubebuilder:scaffold:scheme
flag.BoolVar(, "zk-operator", true, "The operator will 
not use the zk operator & crd when this flag is set to false.")

Review comment:
   As far as I've seen, calling `AddToScheme` works even if the 
Cert-manager CRDs are not installed, so I'm not sure a flag is technically needed, 
but it doesn't hurt to add one.








[jira] [Comment Edited] (LUCENE-9705) Move all codec formats to the o.a.l.codecs.Lucene90 package

2021-02-01 Thread Julie Tibshirani (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276599#comment-17276599
 ] 

Julie Tibshirani edited comment on LUCENE-9705 at 2/1/21, 8:44 PM:
---

{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through service discovery. If a user 
wants to read indices from a previous version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to major version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in jars from Lucene 7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.


was (Author: julietibs):
{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If 
a user wants to read indices from a previous version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to major version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in the codec jars from Lucene 
7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> ---
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the 
> Lucene version they were created. With the upcoming release of Lucene 9.0, it 
> 

[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


thelabdude commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568114059



##
File path: controllers/solrcloud_controller.go
##
@@ -261,12 +268,77 @@ func (r *SolrCloudReconciler) Reconcile(req ctrl.Request) 
(ctrl.Result, error) {
blockReconciliationOfStatefulSet = true
}
 
+   tlsCertMd5 := ""
+   needsPkcs12InitContainer := false // flag if the StatefulSet needs an 
additional initCont to create PKCS12 keystore
+   // don't start reconciling TLS until we have ZK connectivity, avoids 
TLS code having to check for ZK
+   if !blockReconciliationOfStatefulSet && instance.Spec.SolrTLS != nil {
+   ctx := context.TODO()
+   // Create the autogenerated TLS Cert and wait for it to be 
issued
+   if instance.Spec.SolrTLS.AutoCreate != nil {
+   tlsReady, err := r.reconcileAutoCreateTLS(ctx, instance)
+   // don't create the StatefulSet until we have a cert, 
which can take a while for a Let's Encrypt Issuer
+   if !tlsReady || err != nil {
+   if err != nil {
+   r.Log.Error(err, "Reconcile TLS 
Certificate failed")
+   } else {
+   wait := 30 * time.Second
+   if 
instance.Spec.SolrTLS.AutoCreate.IssuerRef == nil {
+   // this is a self-signed cert, 
so no need to wait very long for it to issue
+   wait = 2 * time.Second
+   }
+   requeueOrNot.RequeueAfter = wait
+   }
+   return requeueOrNot, err

Review comment:
   Certs can take several minutes to issue, so I think we want to return 
here with the extended wait period; otherwise you get a ton of noise in the logs 
until the cert issues ...








[GitHub] [lucene-solr] msokolov commented on a change in pull request #2256: LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter

2021-02-01 Thread GitBox


msokolov commented on a change in pull request #2256:
URL: https://github.com/apache/lucene-solr/pull/2256#discussion_r568113101



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -941,6 +969,11 @@ public IndexWriter(Directory d, IndexWriterConfig conf) 
throws IOException {
 // obtain the write.lock. If the user configured a timeout,
 // we wrap with a sleeper and this might take some time.
 writeLock = d.obtainLock(WRITE_LOCK_NAME);
+if (config.getIndexSort() != null && leafSorter != null) {
+  throw new IllegalArgumentException(
+  "[IndexWriter] can't use index sort and leaf sorter at the same 
time!");

Review comment:
   Hmm I do see where you said we want to use `leafSorter` to sort 
documents. I guess I might challenge that design in favor of building on the 
`indexSort` we already have? But perhaps this is better sorted out in the 
context of a later PR, as you suggested.








[GitHub] [lucene-solr] msokolov commented on a change in pull request #2256: LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter

2021-02-01 Thread GitBox


msokolov commented on a change in pull request #2256:
URL: https://github.com/apache/lucene-solr/pull/2256#discussion_r568110673



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -941,6 +969,11 @@ public IndexWriter(Directory d, IndexWriterConfig conf) 
throws IOException {
 // obtain the write.lock. If the user configured a timeout,
 // we wrap with a sleeper and this might take some time.
 writeLock = d.obtainLock(WRITE_LOCK_NAME);
+if (config.getIndexSort() != null && leafSorter != null) {
+  throw new IllegalArgumentException(
+  "[IndexWriter] can't use index sort and leaf sorter at the same 
time!");

Review comment:
   OK, maybe I misunderstood the intent. Perhaps an example would clarify. 
Say that we have three segments, A, B, C containing documents `A={0, 3, 6}; 
B={1, 4, 7}; C={2, 5, 8}`, where the documents are understood to have a single 
field with the value shown, and the index sort is ordered in the natural way.  
Without this change, if we merged A and B, we'd get a new segment `A+B={0, 1, 
3, 4, 6, 7}`. Now suppose there is no index sort (and the documents just 
"happen" to be in the index in the order given above, for the sake of the 
example), and we apply a `leafSorter` that sorts by the minimum value of any 
document in the segment (I guess it could be any sort of aggregate over the 
segment?), then we would get `A+B={0, 3, 6, 1, 4, 7}`. Now if we apply both 
sorts, we would get the same result as in the first case, right? I'm still 
unclear how the conflict arises.
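   
   For illustration, a hedged sketch of the kind of leafSorter being discussed, 
using the new constructor from this PR; the per-segment aggregate helper is 
hypothetical, not a Lucene API:
   ```
   // Hedged sketch: order leaves by a per-segment aggregate (e.g. a maximum timestamp).
   // maxTimestampOf(...) is a hypothetical helper, not part of Lucene.
   Comparator<LeafReader> leafSorter =
       Comparator.comparingLong((LeafReader r) -> maxTimestampOf(r)).reversed();
   IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer), leafSorter);
   ```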








[GitHub] [lucene-solr] msokolov commented on a change in pull request #2231: LUCENE-9680 - Re-add IndexWriter::getFieldNames

2021-02-01 Thread GitBox


msokolov commented on a change in pull request #2231:
URL: https://github.com/apache/lucene-solr/pull/2231#discussion_r568105107



##
File path: lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java
##
@@ -4600,4 +4600,49 @@ public void testIndexWriterBlocksOnStall() throws 
IOException, InterruptedExcept
   }
 }
   }
+
+  public void testGetFieldNames() throws IOException {
+Directory dir = newDirectory();
+
+IndexWriter writer = new IndexWriter(dir, newIndexWriterConfig(new 
MockAnalyzer(random(;
+
+assertEquals(Set.of(), writer.getFieldNames());
+
+addDocWithField(writer, "f1");
+assertEquals(Set.of("f1"), writer.getFieldNames());
+
+// should be unmodifiable:
+final Set fieldSet = writer.getFieldNames();
+assertThrows(UnsupportedOperationException.class, () -> 
fieldSet.add("cannot modify"));
+assertThrows(UnsupportedOperationException.class, () -> 
fieldSet.remove("f1"));
+
+addDocWithField(writer, "f2");
+assertEquals(Set.of("f1", "f2"), writer.getFieldNames());

Review comment:
   Let's also assert that the original `fieldSet` has not been modified - 
it was a true copy and not some kind of alias over an underlying modifiable Set?
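   
   Something along these lines, as a hedged sketch reusing the names already in the test:
   ```
   // Hedged sketch: verify the earlier snapshot is a true copy, not a live view.
   final Set<String> snapshot = writer.getFieldNames();   // contains only "f1" at this point
   addDocWithField(writer, "f2");
   assertEquals(Set.of("f1"), snapshot);                  // unchanged by later additions
   assertEquals(Set.of("f1", "f2"), writer.getFieldNames());
   ```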








[jira] [Commented] (LUCENE-9718) REGEX Pattern Search, character classes with quantifiers do not work

2021-02-01 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276618#comment-17276618
 ] 

Michael Sokolov commented on LUCENE-9718:
-

I guess we come to expect PCRE in every implementation, but this is not that. 
By the way, not even Java is totally compatible with Perl I think. So it's not 
expected that numeric quantifiers in curly braces should work - this is not a 
PCRE implementation.

Further, the supported syntax is clearly documented in `RegExp`'s javadocs, and 
there is a pointer there from `RegExpQuery`:

{{ * The supported syntax is documented in the {@link RegExp} class. Note 
this might be different
 * than other regular expression implementations. For some alternatives with 
different syntax, look
 * under the sandbox.}}



Did you try raising the issue on one of the mailing lists before opening this 
issue? That's usually best.

> REGEX Pattern Search, character classes with quantifiers do not work
> 
>
> Key: LUCENE-9718
> URL: https://issues.apache.org/jira/browse/LUCENE-9718
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.7.3, 8.6.3
>Reporter: Brian Feldman
>Priority: Minor
>
> Character classes with a quantifier do not work, no error is given and no 
> results are returned. For example \d\{2} or \d\{2,3} as is commonly written 
> in most languages supporting regular expressions, simply and quietly does not 
> work.  A user work around is to write them fully out such as \d\d or 
> [0-9][0-9] or as [0-9]\{2,3} .
>  
> This inconsistency or limitation is not documented, wasting the time of users 
> as they have to figure this out themselves. I believe this inconsistency 
> should be clearly documented and an effort to fixing the inconsistency would 
> improve pattern searching.






[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #2256: LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter

2021-02-01 Thread GitBox


mayya-sharipova commented on a change in pull request #2256:
URL: https://github.com/apache/lucene-solr/pull/2256#discussion_r568095366



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -933,6 +936,31 @@ protected final void ensureOpen() throws 
AlreadyClosedException {
* low-level IO error
*/
   public IndexWriter(Directory d, IndexWriterConfig conf) throws IOException {
+this(d, conf, null);
+  }
+
+  /**
+   * Constructs a new IndexWriter per the settings given in conf. 
If you want to make
+   * "live" changes to this writer instance, use {@link #getConfig()}.
+   *
+   * NOTE: after ths writer is created, the given configuration 
instance cannot be passed
+   * to another writer.
+   *
+   * @param d the index directory. The index is either created or appended 
according 
+   * conf.getOpenMode().
+   * @param conf the configuration settings according to which IndexWriter 
should be initialized.
+   * @param leafSorter a comparator for sorting leaf readers. Providing 
leafSorter is useful for
+   * indices on which it is expected to run many queries with particular 
sort criteria (e.g. for
+   * time-based indices this is usually a descending sort on timestamp). 
In this case {@code
+   * leafSorter} should sort leaves according to this sort criteria. 
Providing leafSorter allows
+   * to speed up this particular type of sort queries by early terminating 
while iterating
+   * though segments and segments' documents.
+   * @throws IOException if the directory cannot be read/written to, or if it 
does not exist and
+   * conf.getOpenMode() is OpenMode.APPEND or if 
there is any other
+   * low-level IO error
+   */
+  public IndexWriter(Directory d, IndexWriterConfig conf, 
Comparator leafSorter)

Review comment:
   Addressed in 7ddff67








[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #2256: LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter

2021-02-01 Thread GitBox


mayya-sharipova commented on a change in pull request #2256:
URL: https://github.com/apache/lucene-solr/pull/2256#discussion_r568095174



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -941,6 +969,11 @@ public IndexWriter(Directory d, IndexWriterConfig conf) 
throws IOException {
 // obtain the write.lock. If the user configured a timeout,
 // we wrap with a sleeper and this might take some time.
 writeLock = d.obtainLock(WRITE_LOCK_NAME);
+if (config.getIndexSort() != null && leafSorter != null) {
+  throw new IllegalArgumentException(
+  "[IndexWriter] can't use index sort and leaf sorter at the same 
time!");

Review comment:
   @msokolov Thank you for your feedback and explanation. Sorry, I am still 
not super clear about this point. It seems to me that, just as `indexSorter` maps 
each leaf's documents into the merged segment according to its sort, `leafSorter` 
will likewise map each leaf's documents into the merged segment according to its 
sort (given several merging segments, the first docs in the merged segment should 
be the docs from the segment with the highest sort values...).
   
   But I am ok to remove this check in this PR, as this PR is not concerned 
with merging, and follow up on this point in the following PR. 
   
   Addressed in 7ddff6775c8








[GitHub] [lucene-solr] epugh commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


epugh commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r568092337



##
File path: solr/core/src/java/org/apache/solr/util/ExportTool.java
##
@@ -216,32 +216,32 @@ void end() throws IOException {
   Option.builder("url")
   .hasArg()
   .required()
-  .desc("Address of the collection, example 
http://localhost:8983/solr/gettingstarted;)
+  .desc("Address of the collection, example 
http://localhost:8983/solr/gettingstarted.;)

Review comment:
   also, we do this a lot for the ZooKeeper host:  
   
   ```
   .desc("Address of the ZooKeeper ensemble; defaults to: " + ZK_HOST + '.')
   ```
   
   Maybe convert those to
   
   ```
   .desc("Address of the ZooKeeper ensemble; defaults to: '" + ZK_HOST + "'.")
   ```








[jira] [Comment Edited] (LUCENE-9705) Move all codec formats to the o.a.l.codecs.Lucene90 package

2021-02-01 Thread Julie Tibshirani (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276599#comment-17276599
 ] 

Julie Tibshirani edited comment on LUCENE-9705 at 2/1/21, 7:44 PM:
---

{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If 
a user wants to read indices from a previous version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to major version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in the codec jars from Lucene 
7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.


was (Author: julietibs):
{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If 
a user wants to read indices from a previous major version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in the codec jars from Lucene 
7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> ---
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the 
> Lucene version they were created. With the upcoming release of 

[GitHub] [lucene-solr] epugh commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


epugh commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r568090848



##
File path: solr/core/src/java/org/apache/solr/util/SolrCLI.java
##
@@ -1795,25 +1795,25 @@ public ConfigSetUploadTool(PrintStream stdout) {
   .argName("confname") // Comes out in help message
   .hasArg() // Has one sub-argument
   .required(true) // confname argument must be present
-  .desc("Configset name on Zookeeper")
+  .desc("Configset name on Zookeeper.")

Review comment:
   "several" is right.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jtibshirani opened a new pull request #2287: Remove write logic from Lucene70NormsFormat.

2021-02-01 Thread GitBox


jtibshirani opened a new pull request #2287:
URL: https://github.com/apache/lucene-solr/pull/2287


   Our policy is to not maintain write logic for old formats that can't be 
written
   to. The write logic is moved to the test folder to support unit testing.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-9705) Move all codec formats to the o.a.l.codecs.Lucene90 package

2021-02-01 Thread Julie Tibshirani (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276599#comment-17276599
 ] 

Julie Tibshirani edited comment on LUCENE-9705 at 2/1/21, 7:37 PM:
---

{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If 
a user wants to read indices from a previous major version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
{{Lucene70Codec}} is almost as if we were pulling in the codec jars from Lucene 
7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.


was (Author: julietibs):
{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If 
a user wants to read indices from a previous major version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
\{{Lucene70Codec} is almost as if we were pulling in the codec jars from Lucene 
7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> ---
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the 
> Lucene version they were created. With the upcoming release of 

[GitHub] [lucene-solr] epugh commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


epugh commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r568087950



##
File path: solr/core/src/java/org/apache/solr/util/PackageTool.java
##
@@ -261,44 +261,44 @@ protected void runImpl(CommandLine cli) throws Exception {
 .argName("URL")
 .hasArg()
 .required(true)
-.desc("Address of the Solr Web application, defaults to: " + 
SolrCLI.DEFAULT_SOLR_URL)
+.desc("Address of the Solr Web application, defaults to: " + 
SolrCLI.DEFAULT_SOLR_URL + '.')
 .build(),
 
 Option.builder("collections")
 .argName("COLLECTIONS")
 .hasArg()
 .required(false)
-.desc("List of collections. Run './solr package help' for more 
details.")
+.desc("List of collections.")
 .build(),
 
 Option.builder("cluster")
 .required(false)
-.desc("Needed to install cluster level plugins in a package. Run 
'./solr package help' for more details.")
+.desc("Needed to install cluster level plugins in a package.")

Review comment:
   Take a look at what I put in now. Also, now I see we sometimes say "package" and other times "plugin"; I wonder what the right term is?  I *think* it's "package".





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9705) Move all codec formats to the o.a.l.codecs.Lucene90 package

2021-02-01 Thread Julie Tibshirani (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276599#comment-17276599
 ] 

Julie Tibshirani commented on LUCENE-9705:
--

{quote}It's especially clear here where we must copy a lot of classes with no 
change at all, merely to clearly and consistently document the index version 
change.
{quote}
I’ll try to add some context since I suspect there might be misunderstanding. 
In general when there is a new major version, we *do not* plan to create all 
new index format classes. We only copy a class and move it to backwards-codecs 
when there is a change to that specific format, for example {{PointsFormat}}. 
This proposal applies only to the 9.0 release, and its main purpose is to 
support the work in LUCENE-9047 to move all formats to little endian. My 
understanding is that moving to little endian impacts all the formats and will 
be much cleaner if we used these fresh {{Lucene90*Format}}.
{quote}I wonder if we (eventually) should consider shifting to a versioning 
system that doesn't require new classes. Is this somehow a feature of the 
service discovery API that we use?
{quote}
We indeed load codecs (with their formats) through a service discovery API. If 
a user wants to read indices from a previous major version, they can depend on 
backwards-codecs so Lucene loads the correct older codec. As of LUCENE-9669, we 
allow reading indices back to version N-2.

I personally really like the current "copy-on-write" system for formats. 
There’s code duplication, but it has advantages over combining different 
version logic in the same file:
 * It’s really clear how each version behaves. Having a direct copy like 
\{{Lucene70Codec} is almost as if we were pulling in the codec jars from Lucene 
7.0.
 * It decreases risk of introducing bugs or accidental changes. If you’re 
making an enhancement to a new format, there’s little chance of changing the 
logic for an old format (since it lives in a separate class). This is 
especially important since older formats are not tested as thoroughly.

I started to appreciate it after experiencing the alternative in Elasticsearch, 
where we’re constantly bumping into if/ else version checks when making changes.
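For anyone less familiar with that mechanism, here is a minimal sketch of the SPI-based lookup being described (the codec name and the wrapper class are illustrative; resolving an old codec only works if a jar providing it, e.g. lucene-backward-codecs, is on the classpath):
{code:java}
import org.apache.lucene.codecs.Codec;

public class CodecLookupSketch {
  public static void main(String[] args) {
    // Codec implementations are discovered through Java's service loader.
    // "Lucene70" resolves only if a jar providing that codec is on the classpath,
    // e.g. lucene-backward-codecs when reading indices from an older major version.
    Codec older = Codec.forName("Lucene70");
    System.out.println(older.getName());
  }
}
{code}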

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> ---
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the 
> Lucene version they were created. With the upcoming release of Lucene 9.0, it 
> would be nice to move all those formats to just the o.a.l.codecs.Lucene90 
> package (and of course moving the current ones to the backwards-codecs).
> This issue would actually facilitate moving the directory API to little 
> endian (LUCENE-9047) as the only codecs that would need to handle backwards 
> compatibility will be the codecs in backwards codecs.
> In addition, it can help formalising the use of internal versions vs format 
> versioning ( LUCENE-9616)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] epugh commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


epugh commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r568087125



##
File path: solr/core/src/java/org/apache/solr/util/ExportTool.java
##
@@ -216,32 +216,32 @@ void end() throws IOException {
   Option.builder("url")
   .hasArg()
   .required()
-  .desc("Address of the collection, example 
http://localhost:8983/solr/gettingstarted;)
+  .desc("Address of the collection, example 
http://localhost:8983/solr/gettingstarted.;)

Review comment:
   what would you say to wrapping it in single quotes?  Some other places 
that appears to be the pattern.   
   
   Address of the collection, example 
'http://localhost:8983/solr/gettingstarted'.
   
   Or maybe, I'm just being pedantic about my periods and should skip it!
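   
   For concreteness, a sketch of the quoted form (same builder as the diff above; only the description text changes):
   
   ```
   Option.builder("url")
   .hasArg()
   .required()
   .desc("Address of the collection, example 'http://localhost:8983/solr/gettingstarted'.")
   ```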





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #2284: SOLR-11233: Add optional JAVA8_GC_LOG_FILE_OPTS for bin/solr.

2021-02-01 Thread GitBox


madrob commented on a change in pull request #2284:
URL: https://github.com/apache/lucene-solr/pull/2284#discussion_r568085426



##
File path: solr/bin/solr
##
@@ -2026,7 +2026,11 @@ if [ "$GC_LOG_OPTS" != "" ]; then
 if [ "$JAVA_VENDOR" == "IBM J9" ]; then
   gc_log_flag="-Xverbosegclog"
 fi
-GC_LOG_OPTS+=("$gc_log_flag:$SOLR_LOGS_DIR/solr_gc.log" 
'-XX:+UseGCLogFileRotation' '-XX:NumberOfGCLogFiles=9' '-XX:GCLogFileSize=20M')
+if [ -z ${JAVA8_GC_LOG_FILE_OPTS+x} ]; then

Review comment:
   should be `:+`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14253) Remove Sleeps from OverseerCollectionMessageHandler

2021-02-01 Thread Mike Drob (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob resolved SOLR-14253.
--
Fix Version/s: master (9.0)
 Assignee: Mike Drob
   Resolution: Fixed

> Remove Sleeps from OverseerCollectionMessageHandler
> ---
>
> Key: SOLR-14253
> URL: https://issues.apache.org/jira/browse/SOLR-14253
> Project: Solr
>  Issue Type: Bug
>  Components: Server
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> From the conversations with Mark Miller a few months back - there are a lot 
> of places in the server code where we have hard sleeps instead of relying on 
> notifications and watchers to handle state.
> I will begin to tackle these one at a time, starting with 
> OverseerCollectionMessageHandler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14253) Remove Sleeps from OverseerCollectionMessageHandler

2021-02-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276594#comment-17276594
 ] 

ASF subversion and git services commented on SOLR-14253:


Commit 99748384cfb16cdef2c5a116243cddc23cedf11c in lucene-solr's branch 
refs/heads/master from Mike Drob
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9974838 ]

SOLR-14253 Replace sleep calls with ZK waits (#1297)

Co-Authored-By: markrmiller 
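
For illustration only, the kind of change this commit makes could look roughly like the sketch below (the collection name, timeout, and the isReplicaActive helper are made up for the example; ZkStateReader.waitForState is the API the ZK waits build on):
{code:java}
// Before: poll the cluster state and sleep between checks.
while (!isReplicaActive(zkStateReader.getClusterState().getCollection("myCollection"))) {
  Thread.sleep(1000);
}

// After: block until a predicate over the collection state holds, or time out.
zkStateReader.waitForState("myCollection", 30, TimeUnit.SECONDS,
    (liveNodes, collectionState) -> isReplicaActive(collectionState));
{code}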

> Remove Sleeps from OverseerCollectionMessageHandler
> ---
>
> Key: SOLR-14253
> URL: https://issues.apache.org/jira/browse/SOLR-14253
> Project: Solr
>  Issue Type: Bug
>  Components: Server
>Reporter: Mike Drob
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> From the conversations with Mark Miller a few months back - there are a lot 
> of places in the server code where we have hard sleeps instead of relying on 
> notifications and watchers to handle state.
> I will begin to tackle these one at a time, starting with 
> OverseerCollectionMessageHandler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob merged pull request #1297: SOLR-14253 Replace various sleep calls with ZK waits

2021-02-01 Thread GitBox


madrob merged pull request #1297:
URL: https://github.com/apache/lucene-solr/pull/1297


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov commented on a change in pull request #2282: LUCENE-9615: Expose HnswGraphBuilder index-time hyperparameters as FieldType attributes

2021-02-01 Thread GitBox


msokolov commented on a change in pull request #2282:
URL: https://github.com/apache/lucene-solr/pull/2282#discussion_r568075607



##
File path: 
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90VectorWriter.java
##
@@ -188,9 +190,29 @@ private void writeGraph(
   RandomAccessVectorValuesProducer vectorValues,
   long graphDataOffset,
   long[] offsets,
-  int count)
+  int count,
+  String maxConnStr,
+  String beamWidthStr)
   throws IOException {
-HnswGraphBuilder hnswGraphBuilder = new HnswGraphBuilder(vectorValues);
+int maxConn, beamWidth;
+if (maxConnStr == null) {
+  maxConn = HnswGraphBuilder.DEFAULT_MAX_CONN;
+} else if (!maxConnStr.matches("[0-9]+")) {

Review comment:
   I don't think we need this - we can allow parseInt to throw an 
exception. Let's catch `NumberFormatException` and rethrow with more context 
(which attribute caused the exception). Also HnswGraphBuilder tests for `<= 0`, 
so we don't need to check that here.
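   
   A rough sketch of that (the exact exception message is just an example):
   
   ```
   int maxConn;
   if (maxConnStr == null) {
     maxConn = HnswGraphBuilder.DEFAULT_MAX_CONN;
   } else {
     try {
       maxConn = Integer.parseInt(maxConnStr);
     } catch (NumberFormatException e) {
       // rethrow with context so it's clear which field attribute was malformed
       throw new IllegalArgumentException("invalid max-connections attribute value: " + maxConnStr, e);
     }
   }
   ```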

##
File path: lucene/core/src/java/org/apache/lucene/document/VectorField.java
##
@@ -53,6 +54,44 @@ private static FieldType getType(float[] v, 
VectorValues.SearchStrategy searchSt
 return type;
   }
 
+  /**
+   * Public method to create HNSW field type with the given max-connections 
and beam-width
+   * parameters that would be used by HnswGraphBuilder while constructing HNSW 
graph.
+   *
+   * @param dimension dimension of vectors
+   * @param searchStrategy a function defining vector proximity.
+   * @param maxConn max-connections at each HNSW graph node
+   * @param beamWidth size of list to be used while constructing HNSW graph
+   * @throws IllegalArgumentException if any parameter is null, or has dimension > 1024.
+   */
+  public static FieldType createHnswType(
+  int dimension, VectorValues.SearchStrategy searchStrategy, int maxConn, 
int beamWidth) {
+if (dimension == 0) {
+  throw new IllegalArgumentException("cannot index an empty vector");
+}
+if (dimension > VectorValues.MAX_DIMENSIONS) {
+  throw new IllegalArgumentException(
+  "cannot index vectors with dimension greater than " + 
VectorValues.MAX_DIMENSIONS);
+}
+if (searchStrategy == null) {
+  throw new IllegalArgumentException("search strategy must not be null");

Review comment:
   Let's also assert `searchStrategy.isHnsw()` to catch attempts to use 
`NONE` or some other unsupported future strategy.
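   
   Something along these lines, for example (message wording illustrative):
   
   ```
   if (searchStrategy == null) {
     throw new IllegalArgumentException("search strategy must not be null");
   }
   if (searchStrategy.isHnsw() == false) {
     // rejects NONE and any future strategy this HNSW-specific field type cannot support
     throw new IllegalArgumentException("search strategy must be an HNSW strategy: " + searchStrategy);
   }
   ```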

##
File path: lucene/core/src/test/org/apache/lucene/util/hnsw/KnnGraphTester.java
##
@@ -132,13 +135,13 @@ private void run(String... args) throws Exception {
   if (iarg == args.length - 1) {
 throw new IllegalArgumentException("-beamWidthIndex requires a 
following number");
   }
-  HnswGraphBuilder.DEFAULT_BEAM_WIDTH = Integer.parseInt(args[++iarg]);

Review comment:
   With this change, we no longer have any need to make these static 
variables writable - let's change them to `final` in `HnswGraphBuilder`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #2285: SOLR-14928: introduce distributed cluster state updates

2021-02-01 Thread GitBox


muse-dev[bot] commented on a change in pull request #2285:
URL: https://github.com/apache/lucene-solr/pull/2285#discussion_r568077584



##
File path: solr/core/src/java/org/apache/solr/cloud/Overseer.java
##
@@ -1058,10 +1056,29 @@ public ZkStateReader getZkStateReader() {
   }
 
   public void offerStateUpdate(byte[] data) throws KeeperException, 
InterruptedException {
+// When cluster state change is distributed, the Overseer cluster state 
update queue should only ever receive only QUIT messages.
+// These go to sendQuitToOverseer for execution path clarity.
+if (distributedClusterChangeUpdater.isDistributedStateChange()) {
+  final ZkNodeProps message = ZkNodeProps.load(data);

Review comment:
   *THREAD_SAFETY_VIOLATION:*  Unprotected write. Non-private method 
`Overseer.offerStateUpdate(...)` indirectly writes to field 
`noggit.JSONParser.devNull.buf` outside of synchronization.
Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##
@@ -463,7 +468,11 @@ public boolean isClosed() {
 
 init();

Review comment:
   *THREAD_SAFETY_VIOLATION:*  Read/Write race. Non-private method 
`ZkController(...)` indirectly reads with synchronization from 
`noggit.JSONParser.devNull.buf`. Potentially races with unsynchronized write in 
method `ZkController.preClose()`.
Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##
@@ -2660,17 +2679,26 @@ public boolean 
checkIfCoreNodeNameAlreadyExists(CoreDescriptor dcore) {
*/
   public void publishNodeAsDown(String nodeName) {
 log.info("Publish node={} as DOWN", nodeName);
-ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION, 
OverseerAction.DOWNNODE.toLower(),
-ZkStateReader.NODE_NAME_PROP, nodeName);
-try {
-  overseer.getStateUpdateQueue().offer(Utils.toJSON(m));
-} catch (AlreadyClosedException e) {
-  log.info("Not publishing node as DOWN because a resource required to do 
so is already closed.");
-} catch (InterruptedException e) {
-  Thread.currentThread().interrupt();
-  log.debug("Publish node as down was interrupted.");
-} catch (KeeperException e) {
-  log.warn("Could not publish node as down: ", e);
+if (distributedClusterChangeUpdater.isDistributedStateChange()) {
+  // Note that with the current implementation, when distributed cluster 
state updates are enabled, we mark the node
+  // down synchronously from this thread, whereas the Overseer cluster 
state update frees this thread right away and
+  // the Overseer will async mark the node down but updating all affected 
collections.
+  // If this is an issue (i.e. takes too long), then the call below should 
be executed from another thread so that
+  // the calling thread can immediately return.
+  distributedClusterChangeUpdater.executeNodeDownStateChange(nodeName, 
zkStateReader);

Review comment:
   *THREAD_SAFETY_VIOLATION:*  Unprotected write. Non-private method 
`ZkController.publishNodeAsDown(...)` indirectly writes to field 
`noggit.JSONParser.devNull.buf` outside of synchronization.
Reporting because another access to the same memory occurs on a background 
thread, although this access may not.

##
File path: solr/core/src/java/org/apache/solr/cloud/ZkController.java
##
@@ -1662,22 +1680,20 @@ public void unregister(String coreName, CoreDescriptor 
cd, boolean removeCoreFro
 }
 CloudDescriptor cloudDescriptor = cd.getCloudDescriptor();
 if (removeCoreFromZk) {
-  ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION,
-  OverseerAction.DELETECORE.toLower(), ZkStateReader.CORE_NAME_PROP, 
coreName,
+  ZkNodeProps m = new ZkNodeProps(Overseer.QUEUE_OPERATION, 
OverseerAction.DELETECORE.toLower(),
+  ZkStateReader.CORE_NAME_PROP, coreName,
   ZkStateReader.NODE_NAME_PROP, getNodeName(),
   ZkStateReader.COLLECTION_PROP, cloudDescriptor.getCollectionName(),
   ZkStateReader.CORE_NODE_NAME_PROP, coreNodeName);
-  overseerJobQueue.offer(Utils.toJSON(m));
+  if (distributedClusterChangeUpdater.isDistributedStateChange()) {
+
distributedClusterChangeUpdater.doSingleStateUpdate(DistributedClusterChangeUpdater.MutatingCommand.SliceRemoveReplica,
 m,
+getSolrCloudManager(), zkStateReader);

Review comment:
   *THREAD_SAFETY_VIOLATION:*  Read/Write race. Non-private method 
`ZkController.unregister(...)` indirectly reads without synchronization from 
`this.cloudManager`. Potentially races with write in method 
`ZkController.getSolrCloudManager()`.
Reporting because another access to the same memory occurs on a background 
thread, although this access 

[GitHub] [lucene-solr-operator] HoustonPutman commented on a change in pull request #151: Integrate with cert-manager to issue TLS certs for Solr

2021-02-01 Thread GitBox


HoustonPutman commented on a change in pull request #151:
URL: 
https://github.com/apache/lucene-solr-operator/pull/151#discussion_r568020961



##
File path: api/v1beta1/solrcloud_types.go
##
@@ -758,8 +770,12 @@ func (sc *SolrCloud) CommonServiceName() string {
 }
 
 // InternalURLForCloud returns the name of the common service for the cloud
-func InternalURLForCloud(cloudName string, namespace string) string {
-   return fmt.Sprintf("http://%s-solrcloud-common.%s", cloudName, namespace)
+func InternalURLForCloud(sc *SolrCloud) string {
+   urlScheme := "http"

Review comment:
   this should probably use `sc.urlScheme()`

##
File path: controllers/solrcloud_controller.go
##
@@ -261,12 +268,77 @@ func (r *SolrCloudReconciler) Reconcile(req ctrl.Request) 
(ctrl.Result, error) {
blockReconciliationOfStatefulSet = true
}
 
+   tlsCertMd5 := ""
+   needsPkcs12InitContainer := false // flag if the StatefulSet needs an 
additional initCont to create PKCS12 keystore
+   // don't start reconciling TLS until we have ZK connectivity, avoids 
TLS code having to check for ZK
+   if !blockReconciliationOfStatefulSet && instance.Spec.SolrTLS != nil {
+   ctx := context.TODO()
+   // Create the autogenerated TLS Cert and wait for it to be 
issued
+   if instance.Spec.SolrTLS.AutoCreate != nil {
+   tlsReady, err := r.reconcileAutoCreateTLS(ctx, instance)
+   // don't create the StatefulSet until we have a cert, 
which can take a while for a Let's Encrypt Issuer
+   if !tlsReady || err != nil {
+   if err != nil {
+   r.Log.Error(err, "Reconcile TLS 
Certificate failed")
+   } else {
+   wait := 30 * time.Second
+   if 
instance.Spec.SolrTLS.AutoCreate.IssuerRef == nil {
+   // this is a self-signed cert, 
so no need to wait very long for it to issue
+   wait = 2 * time.Second
+   }
+   requeueOrNot.RequeueAfter = wait
+   }
+   return requeueOrNot, err

Review comment:
   Instead of returning here, you can use 
`blockReconciliationOfStatefulSet=true`, to make sure that only the creation of 
the statefulSet is blocked until the Certs are issued.

##
File path: main.go
##
@@ -65,6 +69,7 @@ func init() {
 
_ = solrv1beta1.AddToScheme(scheme)
_ = zkv1beta1.AddToScheme(scheme)
+   _ = certv1.AddToScheme(scheme)
 
// +kubebuilder:scaffold:scheme
flag.BoolVar(, "zk-operator", true, "The operator will 
not use the zk operator & crd when this flag is set to false.")

Review comment:
   Do we need a flag for the Cert Manager CRDs? Or are we going to assume 
that all users have these installed in their cluster?

##
File path: controllers/solrcloud_controller.go
##
@@ -422,9 +494,9 @@ func reconcileCloudStatus(r *SolrCloudReconciler, solrCloud 
*solr.SolrCloud, new
nodeStatus := solr.SolrNodeStatus{}
nodeStatus.Name = p.Name
nodeStatus.NodeName = p.Spec.NodeName
-   nodeStatus.InternalAddress = "http://" + solrCloud.InternalNodeUrl(nodeStatus.Name, true)

Review comment:
   maybe these methods should just include an option to `includeScheme`, 
just like `includePort`. But we could do this separately.

##
File path: controllers/solrcloud_controller.go
##
@@ -716,7 +785,10 @@ func (r *SolrCloudReconciler) 
SetupWithManagerAndReconciler(mgr ctrl.Manager, re
Owns({}).
Owns({}).
Owns({}).
-   Owns({})
+   Owns({}).
+   Owns({}).

Review comment:
   Did the deployment get left over from a merge with master?
   
   Also do any of the Cert manager things need to be here?

##
File path: controllers/solrcloud_controller.go
##
@@ -772,3 +848,188 @@ func (r *SolrCloudReconciler) 
indexAndWatchForProvidedConfigMaps(mgr ctrl.Manage
},

builder.WithPredicates(predicate.ResourceVersionChangedPredicate{})), nil
 }
+
+// Reconciles the TLS cert, returns either a bool to indicate if the cert is 
ready or an error
+func (r *SolrCloudReconciler) reconcileAutoCreateTLS(ctx context.Context, 
instance *solr.SolrCloud) (bool, error) {
+
+   // short circuit this method with a quick check if the cert exists and 
is ready
+   // this is useful b/c it may take many minutes for a cert to be issued, 
so we avoid
+   // all the other checking that happens below while we're waiting for 
the cert
+   

[jira] [Created] (LUCENE-9721) Hunspell: disallow ONLYINCOMPOUND suffixes at the very end of compound words

2021-02-01 Thread Peter Gromov (Jira)
Peter Gromov created LUCENE-9721:


 Summary: Hunspell: disallow ONLYINCOMPOUND suffixes at the very 
end of compound words
 Key: LUCENE-9721
 URL: https://issues.apache.org/jira/browse/LUCENE-9721
 Project: Lucene - Core
  Issue Type: Sub-task
Reporter: Peter Gromov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] donnerpeter opened a new pull request #2286: LUCENE-9720: Hunspell: more ways to vary misspelled word variations f…

2021-02-01 Thread GitBox


donnerpeter opened a new pull request #2286:
URL: https://github.com/apache/lucene-solr/pull/2286


   …or suggestions
   
   
   
   
   # Description
   
   Work on Hunspell suggestions in progress.
   
   # Solution
   
   Reimplement most of Hunspell's `suggest` logic, without ngrams so far.
   
   # Tests
   
   Several tests from Hunspell repo.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9720) Hunspell: more ways to vary misspelled word variations for suggestions

2021-02-01 Thread Peter Gromov (Jira)
Peter Gromov created LUCENE-9720:


 Summary: Hunspell: more ways to vary misspelled word variations 
for suggestions
 Key: LUCENE-9720
 URL: https://issues.apache.org/jira/browse/LUCENE-9720
 Project: Lucene - Core
  Issue Type: Sub-task
Reporter: Peter Gromov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] NazerkeBS commented on a change in pull request #2230: SOLR-15011: /admin/logging handler is configured logs to all nodes

2021-02-01 Thread GitBox


NazerkeBS commented on a change in pull request #2230:
URL: https://github.com/apache/lucene-solr/pull/2230#discussion_r568061888



##
File path: solr/webapp/web/js/angular/services.js
##
@@ -58,10 +58,10 @@ solrAdminServices.factory('System',
   }])
 .factory('Logging',
   ['$resource', function($resource) {
-return $resource('admin/info/logging', {'wt':'json', 'nodes': 'all', 
'_':Date.now()}, {
+return $resource('admin/info/logging', {'wt':'json', '_':Date.now()}, {
   "events": {params: {since:'0'}},
   "levels": {},
-  "setLevel": {}
+  "setLevel": {'nodes': 'all'}

Review comment:
   I debugged it on the Admin UI and it works without `params`. But I updated my PR to use `params`, as other handlers also use `params` in this file.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] msokolov commented on a change in pull request #2256: LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter

2021-02-01 Thread GitBox


msokolov commented on a change in pull request #2256:
URL: https://github.com/apache/lucene-solr/pull/2256#discussion_r568058172



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -941,6 +969,11 @@ public IndexWriter(Directory d, IndexWriterConfig conf) 
throws IOException {
 // obtain the write.lock. If the user configured a timeout,
 // we wrap with a sleeper and this might take some time.
 writeLock = d.obtainLock(WRITE_LOCK_NAME);
+if (config.getIndexSort() != null && leafSorter != null) {
+  throw new IllegalArgumentException(
+  "[IndexWriter] can't use index sort and leaf sorter at the same 
time!");

Review comment:
   I think it's OK - that MultiSorter is sorting (by index sort) documents 
in several segments being merged; it will control the order of the documents 
within the new segment, but shouldn't have any influence on or conflict with 
the order of the segments being merged





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9705) Move all codec formats to the o.a.l.codecs.Lucene90 package

2021-02-01 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276575#comment-17276575
 ] 

Michael Sokolov commented on LUCENE-9705:
-

Just throwing this out there; I have no real proposal, just a feeling, but it 
seems very heavyweight that we create a new package and new java classes every 
time we change our index format. It's especially clear here where we must copy 
a lot of classes with no change at all, merely to clearly and consistently 
document the index version change. I noticed that we also have to copy (and 
slightly change) the package-level javadocs when we do this, and this has been 
done pretty inconsistently over time.

I wonder if we (eventually) should consider shifting to a versioning system 
that doesn't require new classes. Is this somehow a feature of the service 
discovery API that we use?

> Move all codec formats to the o.a.l.codecs.Lucene90 package
> ---
>
> Key: LUCENE-9705
> URL: https://issues.apache.org/jira/browse/LUCENE-9705
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Current formats are distributed in different packages, prefixed with the 
> Lucene version they were created. With the upcoming release of Lucene 9.0, it 
> would be nice to move all those formats to just the o.a.l.codecs.Lucene90 
> package (and of course moving the current ones to the backwards-codecs).
> This issue would actually facilitate moving the directory API to little 
> endian (LUCENE-9047) as the only codecs that would need to handle backwards 
> compatibility will be the codecs in backwards codecs.
> In addition, it can help formalising the use of internal versions vs format 
> versioning ( LUCENE-9616)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-15115) Remove unused methods from TestRerankBase

2021-02-01 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved SOLR-15115.

Fix Version/s: 8.9
   master (9.0)
   Resolution: Fixed

Thanks [~asalamon74]!

> Remove unused methods from TestRerankBase
> -
>
> Key: SOLR-15115
> URL: https://issues.apache.org/jira/browse/SOLR-15115
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Christine Poerschke
>Priority: Trivial
> Fix For: master (9.0), 8.9
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are two unused methods in TestRerankBase: buildIndexUsingAdoc, 
> loadModelAndFeatures which can be removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15115) Remove unused methods from TestRerankBase

2021-02-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276565#comment-17276565
 ] 

ASF subversion and git services commented on SOLR-15115:


Commit 95c3a3f87f653b1ba0f22d9e1b662e649845ad2a in lucene-solr's branch 
refs/heads/branch_8x from András Salamon
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=95c3a3f ]

SOLR-15115: Remove unused methods from TestRerankBase (#2261)



> Remove unused methods from TestRerankBase
> -
>
> Key: SOLR-15115
> URL: https://issues.apache.org/jira/browse/SOLR-15115
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Christine Poerschke
>Priority: Trivial
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are two unused methods in TestRerankBase: buildIndexUsingAdoc, 
> loadModelAndFeatures which can be removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14920) Format code automatically and enforce it in Solr

2021-02-01 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276559#comment-17276559
 ] 

Jason Gerlowski commented on SOLR-14920:


bq. I'm reluctant to do this to the Solr code base until ... we have some 
Solr-specific consensus

LUCENE-9564 where this was originally proposed had a healthy number of Solr 
guys (AB, Erick, David, Varun, Houston) voicing support.  (You can add my name 
to that list, though this is my first time chiming in.)  And neither 
LUCENE-9564 nor this ticket have brought out any detractors after 3-4 months.  
It seems safe to say we've reached [lazy] consensus on this?

bq. we have some clue what this means for the reference impl.

The concern about {{tidy}} making life harder for the reference_impl branch (or 
any other large feature branch) is well taken.  How much should we avoid 
changes on master out of fear for the merge conflicts they'll cause on the 
reference_impl branch?  I'm of two minds here: I don't want to make anyone's
development difficult, but I also worry generally about the prudence of 
skipping improvements to master because of ref_impl conflicts.  Leaving 
improvements off of master looks a bit like throwing away the bird in the hand 
in favor of the two in the bush.  But I don't have strong convictions there.  
In any case, I agree we need some consensus on that point particularly before 
inconveniencing the ref_impl branch.

But it's worth pointing out there's tons of code that the reference_impl 
_doesn't_ touch that we could start with.  There's a lot of safe places to 
start with this that don't require a consensus on the "ref_impl" discussion.

> Format code automatically and enforce it in Solr
> 
>
> Key: SOLR-14920
> URL: https://issues.apache.org/jira/browse/SOLR-14920
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Priority: Major
>  Labels: codestyle, formatting
>
> See the discussion at: LUCENE-9564.
> This is a placeholder for the present, I'm reluctant to do this to the Solr 
> code base until after:
>  * we have some Solr-specific consensus
>  * we have some clue what this means for the reference impl.
> Reconciling the reference impl will be difficult enough without a zillion 
> format changes to add to the confusion.
> So my proposal is
> 1> do this.
> 2> Postpone this until after the reference impl is merged.
> 3> do this in one single commit for reasons like being able to conveniently 
> have this separated out from git blame.
> Assigning to myself so it doesn't get lost, but anyone who wants to take it 
> over please feel free.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] gus-asf commented on pull request #2185: LUCENE-9659 inequality support in payload check query

2021-02-01 Thread GitBox


gus-asf commented on pull request #2185:
URL: https://github.com/apache/lucene-solr/pull/2185#issuecomment-771044438


   > BTW I saw your relationship Gus but sometimes I enjoy reviewing some code 
too :-).
   
   :) always very welcome of course. I'm slightly irritated with myself for not 
wrangling the tools correctly to get the notifications, but more review is 
always excellent.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15125) Link to docs is brroken

2021-02-01 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276537#comment-17276537
 ] 

Gus Heck commented on SOLR-15125:
-

There has been some difficulty with deploying the docs for the recent release; several of the latest versions are presently not available on the web. This is being worked on urgently by several folks.

> Link to docs is brroken
> ---
>
> Key: SOLR-15125
> URL: https://issues.apache.org/jira/browse/SOLR-15125
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: website
>Reporter: Thomas Güttler
>Priority: Minor
>
> [On this page: 
> https://lucene.apache.org/solr/guide/|https://lucene.apache.org/solr/guide/]
> the link to [https://lucene.apache.org/solr/guide/8_8/]
> is broken.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on pull request #2186: LUCENE-9334 Consistency of field data structures

2021-02-01 Thread GitBox


jpountz commented on pull request #2186:
URL: https://github.com/apache/lucene-solr/pull/2186#issuecomment-771029358


   @mayya-sharipova Yes, if the field has been added previously without doc 
values, then it should be illegal to add it later with doc values indeed.
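   
   For illustration, the situation being discussed looks roughly like this sketch (field name and setup are made up; with the consistency checks in this PR the second addDocument would be expected to throw):
   
   ```
   try (Directory dir = new ByteBuffersDirectory();
       IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
     Document first = new Document();
     first.add(new StringField("id", "1", Field.Store.NO)); // "id" indexed without doc values
     writer.addDocument(first);

     Document second = new Document();
     second.add(new SortedDocValuesField("id", new BytesRef("2"))); // later adds doc values for "id"
     writer.addDocument(second); // expected: IllegalArgumentException once field schemas must be consistent
   }
   ```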



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc opened a new pull request #2285: SOLR-14928: introduce distributed cluster state updates

2021-02-01 Thread GitBox


murblanc opened a new pull request #2285:
URL: https://github.com/apache/lucene-solr/pull/2285


   SOLR-14928: introduce distributed cluster state updates (i.e. not having to 
go through a ZK queue and Overseer)
   
   The motivation behind this PR is to simplify SolrCloud and be the first step 
to (eventually) a significant scale increase: handle orders of magnitude more 
collections than currently and manage cluster state caching in a more flexible 
way.
   
   Changes introduced by this PR:
   
   - Support of distributed state updates for collections through Compare and Swap for updating state.json Zookeeper files (see the sketch after this list).
   - Cluster wide configuration in `solr.xml` to pick the way cluster state 
updates are handled (default remains Overseer based updates)
   - Randomization based on test seed of the cluster update strategy 
(Distributed vs Overseer) in tests using `MiniSolrCloudCluster`, so both 
execution paths get continuously tested.
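   
   As a rough sketch of the compare-and-swap idea (using the plain ZooKeeper client; the path and the applyUpdate helper are illustrative, the PR's own code is the authoritative version):
   
   ```
   while (true) {
     Stat stat = new Stat();
     byte[] current = zk.getData(path, false, stat);     // read data and version
     byte[] updated = applyUpdate(current);               // hypothetical helper building the new state.json bytes
     try {
       zk.setData(path, updated, stat.getVersion());      // conditional write: fails if the version changed
       break;
     } catch (KeeperException.BadVersionException e) {
       // another updater won the race: re-read the latest state and try again
     }
   }
   ```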
   
   A good entry point to explore this PR is class 
`DistributedClusterChangeUpdater`.
   
   Performance-wise, small (low replica count) collection creation is faster with the distributed strategy than with Overseer, and significantly faster when multiple collections are created in parallel (10 parallel threads continuously creating 4-replica collections show a 30% reduction in collection creation time on my laptop). High replica count (~40 and more) collection creation is slower. This will be addressed by using Per Replica States that limit contention on access to a single state.json file and/or a future splitting of state.json into finer-grained elements.
   
   Next step after this one is distributing the execution of the Collection API 
commands to all nodes.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15115) Remove unused methods from TestRerankBase

2021-02-01 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276516#comment-17276516
 ] 

ASF subversion and git services commented on SOLR-15115:


Commit e8bc758144fa0cd77f817f57a8ae20f12868d845 in lucene-solr's branch 
refs/heads/master from András Salamon
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e8bc758 ]

SOLR-15115: Remove unused methods from TestRerankBase (#2261)



> Remove unused methods from TestRerankBase
> -
>
> Key: SOLR-15115
> URL: https://issues.apache.org/jira/browse/SOLR-15115
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Priority: Trivial
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are two unused methods in TestRerankBase: buildIndexUsingAdoc, 
> loadModelAndFeatures which can be removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-15115) Remove unused methods from TestRerankBase

2021-02-01 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke reassigned SOLR-15115:
--

Assignee: Christine Poerschke

> Remove unused methods from TestRerankBase
> -
>
> Key: SOLR-15115
> URL: https://issues.apache.org/jira/browse/SOLR-15115
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Assignee: Christine Poerschke
>Priority: Trivial
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are two unused methods in TestRerankBase: buildIndexUsingAdoc, 
> loadModelAndFeatures which can be removed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cpoerschke merged pull request #2261: SOLR-15115: Remove unused methods from TestRerankBase

2021-02-01 Thread GitBox


cpoerschke merged pull request #2261:
URL: https://github.com/apache/lucene-solr/pull/2261


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] janhoy commented on a change in pull request #2230: SOLR-15011: /admin/logging handler is configured logs to all nodes

2021-02-01 Thread GitBox


janhoy commented on a change in pull request #2230:
URL: https://github.com/apache/lucene-solr/pull/2230#discussion_r568007698



##
File path: solr/webapp/web/js/angular/services.js
##
@@ -58,10 +58,10 @@ solrAdminServices.factory('System',
   }])
 .factory('Logging',
   ['$resource', function($resource) {
-return $resource('admin/info/logging', {'wt':'json', 'nodes': 'all', 
'_':Date.now()}, {
+return $resource('admin/info/logging', {'wt':'json', '_':Date.now()}, {
   "events": {params: {since:'0'}},
   "levels": {},
-  "setLevel": {}
+  "setLevel": {'nodes': 'all'}

Review comment:
   It’s Angular 1. Looks right, but I’m not sure if there should be a 
`params: ` there as well. Testing will tell...





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-11233) GC_LOG_OPTS customisation is a little confusing

2021-02-01 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276513#comment-17276513
 ] 

Christine Poerschke commented on SOLR-11233:


{quote}... a new {{GC_LOG_FILE_OPTS}} option ...
{quote}
[https://github.com/apache/lucene-solr/pull/2284] proposes to name the new 
option {{JAVA8_GC_LOG_FILE_OPTS}} to make it clear that it is specific to Java8 
only (and the option would go away in Solr 9, which does not support Java8).

Help with making and testing the equivalent Windows (i.e. {{solr.cmd}}) changes 
would be appreciated.

Here's what I used for local testing:
{code:java}
export JAVA_HOME=$JAVA8_HOME

ant clean

cd solr ; ant server

export JAVA8_GC_LOG_FILE_OPTS='-Xloggc:./solr_gc.log -XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=10K'

bin/solr start -e techproducts

ps -ef | grep java | less

ls -ltrh `find . -name "*solr_gc.log*"`

bin/post -c techproducts ../solr/example/exampledocs

ls -ltrh `find . -name "*solr_gc.log*"`

bin/solr stop
{code}

> GC_LOG_OPTS customisation is a little confusing
> ---
>
> Key: SOLR-11233
> URL: https://issues.apache.org/jira/browse/SOLR-11233
> Project: Solr
>  Issue Type: Task
>Reporter: Christine Poerschke
>Assignee: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-11233.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{GC_LOG_OPTS}} customisation is currently supported but (pre Java 9) the 
> customised settings are supplemented e.g. 
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/6.6.0/solr/bin/solr#L1713
> {code}
> GC_LOG_OPTS+=("$gc_log_flag:$SOLR_LOGS_DIR/solr_gc.log" 
> '-XX:+UseGCLogFileRotation' '-XX:NumberOfGCLogFiles=9' 
> '-XX:GCLogFileSize=20M')
> {code}
> This seems unexpected and confusing. Some ideas for making it less confusing:
> * a new {{GC_LOG_FILE_OPTS}} option
> ** the new option can be customised but if unset it would default to existing 
> behaviour
> * use customised GC_LOG_OPTS 'as is'
> ** this would be a change to existing behaviour i.e. the 
> [solr#L1713|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/6.6.0/solr/bin/solr#L1713]
>  settings mentioned above would no longer be appended



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] cpoerschke opened a new pull request #2284: SOLR-11233: Add optional JAVA8_GC_LOG_FILE_OPTS for bin/solr.

2021-02-01 Thread GitBox


cpoerschke opened a new pull request #2284:
URL: https://github.com/apache/lucene-solr/pull/2284


   https://issues.apache.org/jira/browse/SOLR-11233
   
   * {{bin/solr}} change -- ready for review
   * {{bin/solr.cmd}} change -- to do (help needed)
   
   Since the `master` branch does not support Java8, this change is best tested on 
`branch_8x`, but it could then potentially still be cherry-picked to the `master` 
branch if we want to keep the `bin/solr*` code on both branches similar.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #2256: LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter

2021-02-01 Thread GitBox


mayya-sharipova commented on a change in pull request #2256:
URL: https://github.com/apache/lucene-solr/pull/2256#discussion_r567994551



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -933,6 +936,31 @@ protected final void ensureOpen() throws 
AlreadyClosedException {
* low-level IO error
*/
   public IndexWriter(Directory d, IndexWriterConfig conf) throws IOException {
+this(d, conf, null);
+  }
+
+  /**
+   * Constructs a new IndexWriter per the settings given in conf. 
If you want to make
+   * "live" changes to this writer instance, use {@link #getConfig()}.
+   *
+   * NOTE: after this writer is created, the given configuration 
instance cannot be passed
+   * to another writer.
+   *
+   * @param d the index directory. The index is either created or appended 
according 
+   * conf.getOpenMode().
+   * @param conf the configuration settings according to which IndexWriter 
should be initialized.
+   * @param leafSorter a comparator for sorting leaf readers. Providing 
leafSorter is useful for
+   * indices on which it is expected to run many queries with particular 
sort criteria (e.g. for
+   * time-based indices this is usually a descending sort on timestamp). 
In this case {@code
+   * leafSorter} should sort leaves according to this sort criteria. Providing leafSorter allows
+   * speeding up this particular type of sort query by terminating early while iterating
+   * through segments and segments' documents.
+   * @throws IOException if the directory cannot be read/written to, or if it 
does not exist and
+   * conf.getOpenMode() is OpenMode.APPEND or if 
there is any other
+   * low-level IO error
+   */
+  public IndexWriter(Directory d, IndexWriterConfig conf, 
Comparator leafSorter)

Review comment:
   Thank you for your suggestion, I will explore this.
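   
   For illustration only (not part of this PR): a minimal sketch of how the constructor proposed in the diff above might be used for a time-based index. The `timestamp` LongPoint field, the max-value lookup and the index path are assumptions made purely for this example.
   
{code:java}
// Hedged usage sketch for the proposed IndexWriter(Directory, IndexWriterConfig,
// Comparator<LeafReader>) constructor. The field name and lookup are illustrative only.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Paths;
import java.util.Comparator;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LeafReader;
import org.apache.lucene.index.PointValues;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class LeafSorterExample {

  // Highest indexed "timestamp" value in a segment, or Long.MIN_VALUE if the field is absent.
  private static long maxTimestamp(LeafReader leaf) {
    try {
      PointValues points = leaf.getPointValues("timestamp");
      return points == null
          ? Long.MIN_VALUE
          : LongPoint.decodeDimension(points.getMaxPackedValue(), 0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Directory dir = FSDirectory.open(Paths.get("/tmp/example-index"));
    IndexWriterConfig conf = new IndexWriterConfig(new StandardAnalyzer());
    // Newest segments first, so "latest first" queries can terminate early.
    Comparator<LeafReader> leafSorter =
        Comparator.comparingLong(LeafSorterExample::maxTimestamp).reversed();
    try (IndexWriter writer = new IndexWriter(dir, conf, leafSorter)) {
      // ... index documents, each with a LongPoint "timestamp" field ...
      writer.commit();
    }
  }
}
{code}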
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #2256: LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter

2021-02-01 Thread GitBox


mayya-sharipova commented on a change in pull request #2256:
URL: https://github.com/apache/lucene-solr/pull/2256#discussion_r567994353



##
File path: lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
##
@@ -941,6 +969,11 @@ public IndexWriter(Directory d, IndexWriterConfig conf) 
throws IOException {
 // obtain the write.lock. If the user configured a timeout,
 // we wrap with a sleeper and this might take some time.
 writeLock = d.obtainLock(WRITE_LOCK_NAME);
+if (config.getIndexSort() != null && leafSorter != null) {
+  throw new IllegalArgumentException(
+  "[IndexWriter] can't use index sort and leaf sorter at the same 
time!");

Review comment:
   @mikemccand Thank you for your review! From the discussion on the Jira 
ticket, we also wanted to use the writer's `leafSorter` during merging to 
arrange docs in a merged segment (by putting first the docs from the segment with 
the highest sort values according to `leafSorter`).
   
   This will conflict with `indexSorter` which, if provided, [will arrange 
merged docs according to its 
sort](https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/MergeState.java#L211).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] donnerpeter opened a new pull request #2283: LUCENE-9719: Resource files aren't deleted from build directory after…

2021-02-01 Thread GitBox


donnerpeter opened a new pull request #2283:
URL: https://github.com/apache/lucene-solr/pull/2283


   … being deleted in source
   
   
   
   
   # Description
   
   e.g. *.sug files used in Hunspell
   
   # Solution
   
   Make Gradle `Sync` them instead of `Copy`
   
   # Tests
   
   No
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9719) Resource files aren't deleted from build directory after being deleted in source

2021-02-01 Thread Peter Gromov (Jira)
Peter Gromov created LUCENE-9719:


 Summary: Resource files aren't deleted from build directory after 
being deleted in source
 Key: LUCENE-9719
 URL: https://issues.apache.org/jira/browse/LUCENE-9719
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Peter Gromov


e.g. *.sug files used in Hunspell



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ctargett commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


ctargett commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r567970635



##
File path: solr/core/src/java/org/apache/solr/util/SolrCLI.java
##
@@ -1795,25 +1795,25 @@ public ConfigSetUploadTool(PrintStream stdout) {
   .argName("confname") // Comes out in help message
   .hasArg() // Has one sub-argument
   .required(true) // confname argument must be present
-  .desc("Configset name on Zookeeper")
+  .desc("Configset name on Zookeeper.")

Review comment:
   Also, 'ZooKeeper'. I'll stop marking them, but there are several others 
to fix.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ctargett commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


ctargett commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r567970635



##
File path: solr/core/src/java/org/apache/solr/util/SolrCLI.java
##
@@ -1795,25 +1795,25 @@ public ConfigSetUploadTool(PrintStream stdout) {
   .argName("confname") // Comes out in help message
   .hasArg() // Has one sub-argument
   .required(true) // confname argument must be present
-  .desc("Configset name on Zookeeper")
+  .desc("Configset name on Zookeeper.")

Review comment:
   Also, 'ZooKeeper'.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ctargett commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


ctargett commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r567970209



##
File path: solr/core/src/java/org/apache/solr/util/SolrCLI.java
##
@@ -1311,13 +1311,13 @@ protected void runCloudTool(CloudSolrClient 
cloudSolrClient, CommandLine cli) th
   .argName("HOST")
   .hasArg()
   .required(false)
-  .desc("Address of the Zookeeper ensemble; defaults to: " + ZK_HOST)
+  .desc("Address of the Zookeeper ensemble; defaults to: " + ZK_HOST + 
'.')

Review comment:
   Should be 'ZooKeeper'





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] ctargett commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


ctargett commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r567969883



##
File path: solr/core/src/java/org/apache/solr/util/SolrCLI.java
##
@@ -236,13 +236,13 @@ protected abstract void runCloudTool(CloudSolrClient 
cloudSolrClient, CommandLin
   .argName("HOST")
   .hasArg()
   .required(false)
-  .desc("Address of the Zookeeper ensemble; defaults to: "+ZK_HOST)
+  .desc("Address of the Zookeeper ensemble; defaults to: "+ ZK_HOST + 
'.')

Review comment:
   ZK is properly spelled 'ZooKeeper' (capital 'K').





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-15126) gc log file rotation broken (when using Java8)

2021-02-01 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved SOLR-15126.

Resolution: Cannot Reproduce

Never mind, and apologies for the noise: I must have mixed something up and can 
no longer reproduce the issue. Experimentation also confirms that appending one 
array to another is possible in shell script, e.g.

{code}
$ FOO="aaa bbb ccc"
$ BAR=($FOO)
$ for i in "${!BAR[@]}"; do echo $i; echo ${BAR[$i]}; done
0
aaa
1
bbb
2
ccc
$ BAR+=("xxx" 'yyy' 'zzz')
$ for i in "${!BAR[@]}"; do echo $i; echo ${BAR[$i]}; done
0
aaa
1
bbb
2
ccc
3
xxx
4
yyy
5
zzz
{code}

> gc log file rotation broken (when using Java8)
> --
>
> Key: SOLR-15126
> URL: https://issues.apache.org/jira/browse/SOLR-15126
> Project: Solr
>  Issue Type: Bug
>Reporter: Christine Poerschke
>Assignee: Christine Poerschke
>Priority: Minor
>
> Noticed when investigating SOLR-11233, which concerns Java8 use. Based on the 
> shell script content we intend to use log rotation, but observation shows that 
> in practice it does not happen (details to follow).
> SOLR-15104 is related but potentially wider in scope (non-Java8 and rotation 
> not just within the same JVM but across successive JVMs).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #2275: SOLR-15123: Make all Tool option descriptions follow the same general pattern.

2021-02-01 Thread GitBox


madrob commented on a change in pull request #2275:
URL: https://github.com/apache/lucene-solr/pull/2275#discussion_r567953137



##
File path: solr/core/src/java/org/apache/solr/util/ExportTool.java
##
@@ -216,32 +216,32 @@ void end() throws IOException {
   Option.builder("url")
   .hasArg()
   .required()
-  .desc("Address of the collection, example 
http://localhost:8983/solr/gettingstarted;)
+  .desc("Address of the collection, example 
http://localhost:8983/solr/gettingstarted.;)

Review comment:
   I think it will be confusing to people whether the period is part of the 
URL or not (if they are copy-pasting from a tutorial, perhaps).

##
File path: solr/core/src/java/org/apache/solr/util/PackageTool.java
##
@@ -261,44 +261,44 @@ protected void runImpl(CommandLine cli) throws Exception {
 .argName("URL")
 .hasArg()
 .required(true)
-.desc("Address of the Solr Web application, defaults to: " + 
SolrCLI.DEFAULT_SOLR_URL)
+.desc("Address of the Solr Web application, defaults to: " + 
SolrCLI.DEFAULT_SOLR_URL + '.')
 .build(),
 
 Option.builder("collections")
 .argName("COLLECTIONS")
 .hasArg()
 .required(false)
-.desc("List of collections. Run './solr package help' for more 
details.")
+.desc("List of collections.")
 .build(),
 
 Option.builder("cluster")
 .required(false)
-.desc("Needed to install cluster level plugins in a package. Run 
'./solr package help' for more details.")
+.desc("Needed to install cluster level plugins in a package.")

Review comment:
   I'm not sure what this description means. There is no verb?

##
File path: solr/core/src/java/org/apache/solr/util/SolrCLI.java
##
@@ -1795,25 +1795,25 @@ public ConfigSetUploadTool(PrintStream stdout) {
   .argName("confname") // Comes out in help message
   .hasArg() // Has one sub-argument
   .required(true) // confname argument must be present
-  .desc("Configset name on Zookeeper")
+  .desc("Configset name on Zookeeper.")

Review comment:
   s/on/in maybe?

##
File path: solr/core/src/java/org/apache/solr/util/SolrCLI.java
##
@@ -1942,19 +1942,19 @@ public ZkRmTool(PrintStream stdout) {
   .argName("path")
   .hasArg()
   .required(true)
-  .desc("Path to remove")
+  .desc("Path to remove.")
   .build(),
   Option.builder("recurse")
   .argName("recurse")
   .hasArg()
   .required(false)
-  .desc("Recurse (true|false, default is false)")
+  .desc("Recurse (true|false), default is false.")

Review comment:
   These parentheses are inconsistent with the other tools





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15124) Remove node/container level admin handlers from ImplicitPlugins.json (core level).

2021-02-01 Thread Mike Drob (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276436#comment-17276436
 ] 

Mike Drob commented on SOLR-15124:
--

Should we consider setting up redirect paths for the old handlers? Or a better 
error message with a hint that they have moved?

Also, will need to update SolrCoreTest.testImplicitPlugins, which I don't think 
the existing PR did.

> Remove node/container level admin handlers from ImplicitPlugins.json (core 
> level).
> --
>
> Key: SOLR-15124
> URL: https://issues.apache.org/jira/browse/SOLR-15124
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: David Smiley
>Priority: Blocker
>  Labels: newdev
> Fix For: master (9.0)
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> There are many very old administrative RequestHandlers registered in a 
> SolrCore that are actually JVM / node / CoreContainer level in nature.  These 
> pre-dated CoreContainer level handlers.  We should (1) remove them from 
> ImplicitPlugins.json, and (2) make simplifying tweaks so that they no longer 
> work at the core level.  For example, LoggingHandler has two constructors 
> and a non-final Watcher because it works in these two modalities.  It need 
> only have the one that takes a CoreContainer, and Watcher will then be final 
> (a rough sketch follows after this description).
> /admin/threads
> /admin/properties
> /admin/logging
> Should stay because has core-level stuff:
> /admin/plugins
> /admin/mbeans
> This one:
> /admin/system -- SystemInfoHandler
> returns "core" level information and also node level stuff.  I propose 
> splitting this one into a CoreInfoHandler to separate the logic.  Maybe a 
> separate issue.
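
A rough sketch of the "single CoreContainer constructor, final watcher" shape described above, for illustration only. The class mirrors the existing org.apache.solr.handler.admin.LoggingHandler, but the body shown here is an assumption made for this sketch rather than an actual patch.

{code:java}
// Illustrative sketch only: what LoggingHandler could look like once it is node-level only.
import org.apache.solr.core.CoreContainer;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.logging.LogWatcher;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;

public class LoggingHandler extends RequestHandlerBase {

  // Can be final once the core-level (no-arg) constructor is removed.
  private final LogWatcher<?> watcher;

  public LoggingHandler(CoreContainer cc) {
    this.watcher = cc.getLogging();
  }

  @Override
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    // ... read or set log levels and recent events via the (now final) watcher ...
  }

  @Override
  public String getDescription() {
    return "Logging handler to set log levels and retrieve recent log events";
  }
}
{code}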



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #2265: SOLR-15119 Add logs and make default splitMethod to be LINK

2021-02-01 Thread GitBox


madrob commented on a change in pull request #2265:
URL: https://github.com/apache/lucene-solr/pull/2265#discussion_r567947008



##
File path: 
solr/core/src/java/org/apache/solr/cloud/api/collections/SplitShardCmd.java
##
@@ -580,6 +597,10 @@ public boolean split(ClusterState clusterState, 
ZkNodeProps message, NamedList

[jira] [Created] (SOLR-15126) gc log file rotation broken (when using Java8)

2021-02-01 Thread Christine Poerschke (Jira)
Christine Poerschke created SOLR-15126:
--

 Summary: gc log file rotation broken (when using Java8)
 Key: SOLR-15126
 URL: https://issues.apache.org/jira/browse/SOLR-15126
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke
Assignee: Christine Poerschke


Noticed when investigating SOLR-11233, which concerns Java8 use. Based on the 
shell script content we intend to use log rotation, but observation shows that 
in practice it does not happen (details to follow).

SOLR-15104 is related but potentially wider in scope (non-Java8 and rotation 
not just within the same JVM but across successive JVMs).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org


