[jira] [Commented] (SOLR-14869) [child] transformer includes deleted documents

2020-09-23 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201252#comment-17201252
 ] 

David Smiley commented on SOLR-14869:
-

Eeek!  I think this bug is my fault.

> [child] transformer includes deleted documents
> --
>
> Key: SOLR-14869
> URL: https://issues.apache.org/jira/browse/SOLR-14869
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> If you have nested documents that include "deleted" docs (i.e. using "delete 
> by query"), the {{\[child\]}} transformer still includes those deleted docs in 
> its output.
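The missing filter can be pictured with a tiny JDK-only model (an illustration of the bug, not the actual transformer or Lucene code; every name here is made up): children gathered by doc-id range must also be checked against the segment's live-docs bitmap, otherwise docs removed by delete-by-query still show up.

```java
import java.util.BitSet;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Toy model of child-doc collection (hypothetical, JDK-only): the
// children of a block-indexed root occupy the doc ids just before it.
public class ChildCollector {
    /**
     * Collects live children in [firstChild, rootDoc).
     * A set bit in liveDocs means the doc is live; a null liveDocs
     * means the segment has no deletions.
     */
    public static List<Integer> collectChildren(int firstChild, int rootDoc, BitSet liveDocs) {
        return IntStream.range(firstChild, rootDoc)
                .filter(doc -> liveDocs == null || liveDocs.get(doc)) // the missing check
                .boxed()
                .collect(Collectors.toList());
    }
}
```

Dropping the liveDocs filter reproduces the reported behavior: doc ids whose bits were cleared by a delete-by-query are still returned as children.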



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14868) nested document 'remove' operation should "delete in place"

2020-09-23 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201250#comment-17201250
 ] 

David Smiley commented on SOLR-14868:
-

+1 to the idea

> nested document 'remove' operation should "delete in place"
> ---
>
> Key: SOLR-14868
> URL: https://issues.apache.org/jira/browse/SOLR-14868
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Priority: Major
>
> Currently, if you use the "remove" atomic update operation for nested 
> documents, it completely re-indexes the "root" document and all remaining 
> child documents (after logically removing the specified child and its 
> descendants).
> Ideally, unless this operation is combined with other atomic update 
> operations, "remove" should be done "in place" -- just using the IndexWriter 
> to mark the specified child document (and its descendants) as deleted.
> (The full re-indexing of the 'root' level document can be very slow for 
> large, deeply nested documents.)
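A toy model of the proposal (plain JDK, purely illustrative -- Solr/Lucene would do this through IndexWriter deletes, and every name below is made up): assuming block indexing keeps a removed child and its descendants in a contiguous doc-id range, a "delete in place" just flips their tombstone bits instead of re-indexing the whole block.

```java
import java.util.BitSet;

// Toy model of "delete in place" for a block of nested documents
// (hypothetical, JDK-only): instead of re-indexing the root and the
// surviving children, only mark the removed subtree's docs as deleted.
public class InPlaceRemove {
    /**
     * Marks childDoc and its descendantCount descendants deleted.
     * Assumes the subtree occupies the contiguous doc-id range
     * [childDoc, childDoc + descendantCount].
     */
    public static void removeSubtree(BitSet deleted, int childDoc, int descendantCount) {
        deleted.set(childDoc, childDoc + descendantCount + 1); // to-index is exclusive
    }
}
```

The cost is proportional to the removed subtree, not to the size of the whole nested document, which is the speedup the issue is after.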






[jira] [Commented] (SOLR-14861) CoreContainer shutdown needs to be aware of other ongoing operations and wait until they're complete

2020-09-23 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201248#comment-17201248
 ] 

David Smiley commented on SOLR-14861:
-

bq. I think we need a way for shutdown to somehow cause Solr to start refusing 
all incoming requests, wait until all in-flight operations are complete, and 
then start shutting down.

+1
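The proposal quoted above -- refuse new requests, drain in-flight ones, then shut down -- can be sketched with plain java.util.concurrent. This is a hypothetical gate, not CoreContainer's actual code; all names are illustrative.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical shutdown gate: refuse new requests once shutdown starts,
// and let shutdown block until every admitted request has finished.
public class ShutdownGate {
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
    private final Phaser inFlight = new Phaser(1); // party 0 is the gate itself

    /** Request threads call this first; false means "refused, shutting down". */
    public boolean tryEnter() {
        if (shuttingDown.get()) return false;
        inFlight.register();
        if (shuttingDown.get()) {         // re-check: shutdown may have raced us
            inFlight.arriveAndDeregister();
            return false;
        }
        return true;
    }

    /** Request threads call this when their operation completes. */
    public void exit() {
        inFlight.arriveAndDeregister();
    }

    /** Flips the gate, then blocks until all in-flight requests have exited. */
    public void shutdownAndAwait() {
        shuttingDown.set(true);
        inFlight.arriveAndAwaitAdvance(); // advances only when all parties arrive
    }
}
```

A Phaser is used rather than a spin loop so shutdown parks instead of burning CPU while it waits for the last requests to drain.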

> CoreContainer shutdown needs to be aware of other ongoing operations and wait 
> until they're complete
> 
>
> Key: SOLR-14861
> URL: https://issues.apache.org/jira/browse/SOLR-14861
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14861.patch
>
>
> Noble and I are trying to get to the bottom of the TestBulkSchemaConcurrent 
> failures and found what looks like a glaring gap in how 
> CoreContainer.shutdown operates. I don't know the impact on production since 
> we're shutting down anyway, but I think this is responsible for the errors in 
> TestBulkSchemaConcurrent and likely behind others, especially any other test 
> that fails intermittently that involves core reloads, including and 
> especially any tests that exercise managed schema.
> We have clear evidence of this sequence:
> 1> some CoreContainer.reloads come in and get _partway_ through, in 
> particular past the test at the top where CoreContainer.reload() throws an 
> AlreadyClosed exception if (isShutdown).
> 2> Some CoreContainer.shutdown() threads get some processing time before the 
> reloads in <1> are finished.
> 3> the threads in <1> pick back up and go wonky. I suspect there are a 
> number of different things that could go wrong here, depending on how far 
> CoreContainer.shutdown() has gotten, and that they pop out in different ways.
> Since it's my shift (Noble has to sleep sometime), I put some crude locking 
> in just to test the idea; incrementing an AtomicInteger on entry to 
> CoreContainer.reload then decrementing it at the end, and spinning in 
> CoreContainer.shutdown() until the AtomicInteger was back to zero. With that 
> in place, 100 runs and no errors whereas before I could never get even 10 
> runs to finish without an error. This is not a proper fix at all, and the way 
> it's currently running there are still possible race conditions, just much 
> smaller windows. And I suspect it risks spinning forever. But it's enough to 
> make me believe I finally understand what's happening.
> I also suspect that reload is more sensitive than most operations on a core 
> due to the fact that it runs for a long time, but I assume other operations 
> have the same potential. Shouldn't CoreContainer.shutDown() wait until no 
> other operations are in flight?
> On a quick scan of CoreContainer, there are actually few places where we even 
> check for isShutdown; I suspect the places we do are ad hoc, found by 
> trial-and-error when tests fail. We need a design rather than hit-or-miss 
> hacking.
> I think that isShutdown should be replaced with something more robust. What 
> that is IDK quite yet because I've been hammering at this long enough and I 
> need a break.
> This is consistent with another observation about this particular test: if 
> there's a sleep at the end, it doesn't fail; all the reloads get a chance to 
> finish before anything is shut down.
> An open question is how much this matters to production systems. In the 
> testing case, bunches of these reloads are issued and then we immediately end 
> the test and start shutting things down. It needs to be fixed if we're going 
> to cut down on test failures, though. Besides, it's just wrong ;)
> Assigning to myself to track. I'd be perfectly happy, now that Noble and I 
> have done the hard work, for someone to swoop in and take the credit for 
> fixing it ;)
> gradlew beast -Ptests.dups=10 --tests TestBulkSchemaConcurrent
> always fails for me on current code without my hack...
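The crude guard described above can be sketched as follows (illustrative names, not the actual patch): reload bumps a counter on entry and drops it on exit, and shutdown spins until the counter is back to zero. As the report says, this only shrinks the race windows -- a reload can still sneak past the isShutdown check, and the spin can hang if a reload never finishes -- but it demonstrates the sequencing problem.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard mirroring the experiment described in the issue.
public class ReloadGuard {
    private final AtomicInteger activeReloads = new AtomicInteger();

    /** Counts the reload as in-flight for its whole duration. */
    public void reload(Runnable doReload) {
        activeReloads.incrementAndGet();
        try {
            doReload.run();
        } finally {
            activeReloads.decrementAndGet();
        }
    }

    /** Spins until no reload is mid-flight, then shuts down. */
    public void shutdown(Runnable doShutdown) {
        while (activeReloads.get() > 0) {
            Thread.onSpinWait(); // busy-wait until in-flight reloads drain
        }
        doShutdown.run();
    }
}
```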






[jira] [Commented] (SOLR-14894) Use annotations to implement V2 collection APIs

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201234#comment-17201234
 ] 

ASF subversion and git services commented on SOLR-14894:


Commit 1f26043c1c2465f1e4dfcc61e7026dfe5b8b452c in lucene-solr's branch 
refs/heads/branch_8x from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1f26043 ]

SOLR-14894: Use annotations to implement V2 collection APIs


> Use annotations to implement V2 collection APIs
> ---
>
> Key: SOLR-14894
> URL: https://issues.apache.org/jira/browse/SOLR-14894
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> This is a refactoring exercise






[jira] [Commented] (SOLR-14894) Use annotations to implement V2 collection APIs

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201235#comment-17201235
 ] 

ASF subversion and git services commented on SOLR-14894:


Commit 7eb2961590fa555d063f425bd8d47ea34bbfd17b in lucene-solr's branch 
refs/heads/branch_8x from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7eb2961 ]

SOLR-14894: ASL header


> Use annotations to implement V2 collection APIs
> ---
>
> Key: SOLR-14894
> URL: https://issues.apache.org/jira/browse/SOLR-14894
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> This is a refactoring exercise






[jira] [Commented] (SOLR-14894) Use annotations to implement V2 collection APIs

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201228#comment-17201228
 ] 

ASF subversion and git services commented on SOLR-14894:


Commit 26bb6415d12a48640d92ca25765e9e43ec087145 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=26bb641 ]

SOLR-14894: ASL header


> Use annotations to implement V2 collection APIs
> ---
>
> Key: SOLR-14894
> URL: https://issues.apache.org/jira/browse/SOLR-14894
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> This is a refactoring exercise






[jira] [Commented] (SOLR-14894) Use annotations to implement V2 collection APIs

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201226#comment-17201226
 ] 

ASF subversion and git services commented on SOLR-14894:


Commit 565c5b1ac4ac0b7beca0361607846119f5902af4 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=565c5b1 ]

SOLR-14894: Use annotations to implement V2 collection APIs


> Use annotations to implement V2 collection APIs
> ---
>
> Key: SOLR-14894
> URL: https://issues.apache.org/jira/browse/SOLR-14894
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> This is a refactoring exercise






[jira] [Updated] (SOLR-14894) Use annotations to implement V2 collection APIs

2020-09-23 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14894:
--
Description: This is a refactoring exercise

> Use annotations to implement V2 collection APIs
> ---
>
> Key: SOLR-14894
> URL: https://issues.apache.org/jira/browse/SOLR-14894
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>
> This is a refactoring exercise






[GitHub] [lucene-solr] mocobeta commented on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common

2020-09-23 Thread GitBox


mocobeta commented on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-698064481


   Thanks @dweiss and @uschindler, I will make the necessary changes and let 
you know then.
   
   > precommit doesn't pass though
   
   Yes, because many tests need to be fixed (I changed only main classes for 
early stage review).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14894) Use annotations to implement V2 collection APIs

2020-09-23 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14894:
--
Summary: Use annotations to implement V2 collection APIs  (was: Use annot)

> Use annotations to implement V2 collection APIs
> ---
>
> Key: SOLR-14894
> URL: https://issues.apache.org/jira/browse/SOLR-14894
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>







[jira] [Assigned] (SOLR-14894) Use annotations to implement V2 collection APIs

2020-09-23 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul reassigned SOLR-14894:
-

Assignee: Noble Paul

> Use annotations to implement V2 collection APIs
> ---
>
> Key: SOLR-14894
> URL: https://issues.apache.org/jira/browse/SOLR-14894
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>







[jira] [Created] (SOLR-14894) Use annot

2020-09-23 Thread Noble Paul (Jira)
Noble Paul created SOLR-14894:
-

 Summary: Use annot
 Key: SOLR-14894
 URL: https://issues.apache.org/jira/browse/SOLR-14894
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul









[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-09-23 Thread GitBox


noblepaul edited a comment on pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#issuecomment-698054014


   Is it possible to explain the changes to public touch points?
   
   * an example of how you register some type of plugin
   * how it looks in some JSON in ZK
   * If a plugin can be registered using a public API, is there a testcase for 
the same?






[GitHub] [lucene-solr] noblepaul commented on pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-09-23 Thread GitBox


noblepaul commented on pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#issuecomment-698054014


   Is it possible to explain the changes to public touch points?
   
   * an example of how you register some type of plugin
   * how it looks in some JSON in ZK
   * If a plugin can be registered using a public API, is there a testcase for 
the same?






[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.

2020-09-23 Thread GitBox


noblepaul commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r493975213



##
File path: solr/core/src/java/org/apache/solr/api/CustomContainerPlugins.java
##
@@ -78,6 +79,10 @@ public void writeMap(EntryWriter ew) throws IOException {
 currentPlugins.forEach(ew.getBiConsumer());
   }
 
+  public synchronized ApiInfo getPlugin(String name) {

Review comment:
   why is this synchronized ?
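For context on the question: assuming the plugin registry is backed by a ConcurrentHashMap (a hypothetical sketch below, not the real CustomContainerPlugins), a plain lookup is already thread-safe without {{synchronized}}; the keyword only earns its keep for compound check-then-act sequences.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry illustrating when synchronized is (not) needed.
public class PluginRegistry {
    private final Map<String, Object> currentPlugins = new ConcurrentHashMap<>();

    // A plain lookup is thread-safe on its own: no synchronized needed.
    public Object getPlugin(String name) {
        return currentPlugins.get(name);
    }

    // synchronized matters only for compound check-then-act sequences,
    // and even this one could be ConcurrentHashMap.putIfAbsent instead.
    public synchronized Object registerIfAbsent(String name, Object plugin) {
        Object existing = currentPlugins.get(name);
        if (existing != null) {
            return existing;
        }
        currentPlugins.put(name, plugin);
        return plugin;
    }
}
```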

##
File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java
##
@@ -889,7 +896,37 @@ public void load() {
   ContainerPluginsApi containerPluginsApi = new ContainerPluginsApi(this);
   
containerHandlers.getApiBag().registerObject(containerPluginsApi.readAPI);
   
containerHandlers.getApiBag().registerObject(containerPluginsApi.editAPI);
+
+  // create the ClusterEventProducer
+  CustomContainerPlugins.ApiInfo clusterEventProducerInfo = 
customContainerPlugins.getPlugin(ClusterEventProducer.PLUGIN_NAME);
+  if (clusterEventProducerInfo != null) {
+clusterEventProducer = (ClusterEventProducer) 
clusterEventProducerInfo.getInstance();
+  } else {
+clusterEventProducer = new ClusterEventProducerImpl(this);
+  }
+  // init ClusterSingleton-s
+  Map<String, ClusterSingleton> singletons = new ConcurrentHashMap<>();
+  if (clusterEventProducer instanceof ClusterSingleton) {
+singletons.put(ClusterEventProducer.PLUGIN_NAME, (ClusterSingleton) 
clusterEventProducer);
+  }
+
+  // register ClusterSingleton handlers
+  // XXX register also other ClusterSingleton-s from packages - how?
+  containerHandlers.keySet().forEach(handlerName -> {

Review comment:
   what are you trying to do here?
   
   I'm totally confused about
   
   - What are the objects you are registering
   - where they are registered?
   - what is the purpose?

##
File path: 
solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java
##
@@ -64,15 +64,15 @@ public ContainerPluginsApi(CoreContainer coreContainer) {
 
   public class Read {
 @EndPoint(method = METHOD.GET,
-path = "/cluster/plugin",
+path = "/cluster/plugins",

Review comment:
   you know this is a backward incompatible change , right?

##
File path: 
solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java
##
@@ -64,15 +64,15 @@ public ContainerPluginsApi(CoreContainer coreContainer) {
 
   public class Read {
 @EndPoint(method = METHOD.GET,
-path = "/cluster/plugin",
+path = "/cluster/plugins",
 permission = PermissionNameProvider.Name.COLL_READ_PERM)
 public void list(SolrQueryRequest req, SolrQueryResponse rsp) throws 
IOException {
-  rsp.add(PLUGIN, plugins(zkClientSupplier));
+  rsp.add(PLUGINS, plugins(zkClientSupplier));

Review comment:
   this too is backward incompatible

##
File path: solr/core/src/java/org/apache/solr/packagemanager/PackageManager.java
##
@@ -231,7 +232,7 @@ public void uninstall(String packageName, String version) {
   }
 }
 @SuppressWarnings({"unchecked"})
-Map<String, Object> clusterPlugins = (Map<String, Object>) 
result.getOrDefault("plugin", Collections.emptyMap());
+Map<String, Object> clusterPlugins = (Map<String, Object>) 
result.getOrDefault(ContainerPluginsApi.PLUGINS, Collections.emptyMap());

Review comment:
   backward incompatble








[jira] [Resolved] (SOLR-14890) Refactor code to use annotations for configset API

2020-09-23 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul resolved SOLR-14890.
---
Fix Version/s: 8.7
   Resolution: Fixed

> Refactor code to use annotations for configset API
> --
>
> Key: SOLR-14890
> URL: https://issues.apache.org/jira/browse/SOLR-14890
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>







[jira] [Commented] (SOLR-14890) Refactor code to use annotations for configset API

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201194#comment-17201194
 ] 

ASF subversion and git services commented on SOLR-14890:


Commit 1c9c1509fa08ff45f8caaebafd7d90a557dd3643 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1c9c150 ]

SOLR-14890: syncing with 8x


> Refactor code to use annotations for configset API
> --
>
> Key: SOLR-14890
> URL: https://issues.apache.org/jira/browse/SOLR-14890
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>







[jira] [Resolved] (SOLR-14883) Add a Muse (CI) configuration file

2020-09-23 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe resolved SOLR-14883.
--
Fix Version/s: master (9.0)
   Resolution: Done

> Add a Muse (CI) configuration file
> --
>
> Key: SOLR-14883
> URL: https://issues.apache.org/jira/browse/SOLR-14883
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Thomas DuBuisson
>Priority: Trivial
> Fix For: master (9.0)
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This ticket is a continuation of a conversation that started at 
> https://issues.apache.org/jira/browse/SOLR-14819?filter=-2 and was also 
> discussed on the mailing list.
>  
> The Apache infrastructure team has installed the Muse GitHub application.  In 
> order for Lucene-Solr to benefit, the application must understand how to 
> build the project.  While it has build heuristics, it does not guess between 
> JDK8 and JDK11 (the two JDKs best supported by several static analysis 
> tools).  Thus, this configuration explicitly instructs the use of JDK11.






[jira] [Assigned] (SOLR-14883) Add a Muse (CI) configuration file

2020-09-23 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe reassigned SOLR-14883:


Assignee: Tomas Eduardo Fernandez Lobbe

> Add a Muse (CI) configuration file
> --
>
> Key: SOLR-14883
> URL: https://issues.apache.org/jira/browse/SOLR-14883
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Thomas DuBuisson
>Assignee: Tomas Eduardo Fernandez Lobbe
>Priority: Trivial
> Fix For: master (9.0)
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This ticket is a continuation of a conversation that started at 
> https://issues.apache.org/jira/browse/SOLR-14819?filter=-2 and was also 
> discussed on the mailing list.
>  
> The Apache infrastructure team has installed the Muse GitHub application.  In 
> order for Lucene-Solr to benefit, the application must understand how to 
> build the project.  While it has build heuristics, it does not guess between 
> JDK8 and JDK11 (the two JDKs best supported by several static analysis 
> tools).  Thus, this configuration explicitly instructs the use of JDK11.






[jira] [Commented] (SOLR-14890) Refactor code to use annotations for configset API

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201190#comment-17201190
 ] 

ASF subversion and git services commented on SOLR-14890:


Commit a2dcba04e2082cccd1a2c3ef8f93ffccf7e1d879 in lucene-solr's branch 
refs/heads/branch_8x from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a2dcba0 ]

SOLR-14890: Refactor code to use annotations for configset API


> Refactor code to use annotations for configset API
> --
>
> Key: SOLR-14890
> URL: https://issues.apache.org/jira/browse/SOLR-14890
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>







[jira] [Commented] (SOLR-14883) Add a Muse (CI) configuration file

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201187#comment-17201187
 ] 

ASF subversion and git services commented on SOLR-14883:


Commit 6599cc835a7c47162f0bef5cb37e71cee7590d19 in lucene-solr's branch 
refs/heads/master from Thomas M. DuBuisson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6599cc8 ]

SOLR-14883 Add a Muse (Continuous assurance platform) configuration (#1901)

* Add a Muse (Continuous assurance platform) configuration

The full documentation is docs.muse.dev.  Most interesting is the
repository configuration reference
https://docs.muse.dev/docs/repository-configuration/ which details how
to change which tools runs, add custom tools (other than those built
into the platform by default), filter for or against certain bug types,
etc.

For example, an explicit `.muse/config.toml` file might be:

```
jdk11 = true
build = "./gradlew assemble"
tools = [ "infer", "errorprone", "findsecbugs" ]
customTools = [ ]
```

* Add a comment to the toml file

> Add a Muse (CI) configuration file
> --
>
> Key: SOLR-14883
> URL: https://issues.apache.org/jira/browse/SOLR-14883
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Thomas DuBuisson
>Priority: Trivial
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This ticket is a continuation of a conversation that started at 
> https://issues.apache.org/jira/browse/SOLR-14819?filter=-2 and was also 
> discussed on the mailing list.
>  
> The Apache infrastructure team has installed the Muse GitHub application.  In 
> order for Lucene-Solr to benefit, the application must understand how to 
> build the project.  While it has build heuristics, it does not guess between 
> JDK8 and JDK11 (the two JDKs best supported by several static analysis 
> tools).  Thus, this configuration explicitly instructs the use of JDK11.






[GitHub] [lucene-solr] tflobbe merged pull request #1901: SOLR-14883 Add a Muse (Continuous assurance platform) configuration

2020-09-23 Thread GitBox


tflobbe merged pull request #1901:
URL: https://github.com/apache/lucene-solr/pull/1901


   






[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-09-23 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201179#comment-17201179
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Thanks, Mark, for your nice words.

[~ichattopadhyaya] I will try to do a benchmark based on your project above. If 
I'm not able to finish it before the 8.7 release, then reverting it will be a 
good option.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request will submit n requests (n 
> equals the number of shards) to an executor. So each request corresponds to a 
> thread; after sending a request, that thread basically does nothing but wait 
> for the response from the other side. That thread will be swapped out and the 
> CPU will try to handle another thread (this is called a context switch: the 
> CPU saves the context of the current thread and switches to another one). 
> When some data (not all) comes back, that thread will be called to parse it, 
> then it will wait until more data comes back. So there will be lots of 
> context switching on the CPU, which is quite an inefficient use of threads. 
> Basically we want fewer threads, and most of them must be busy all the time, 
> because threads are not free, and neither is context switching. That is the 
> main idea behind everything, like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty's HttpClient offers an async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} of 
> content (not the whole response), it calls the {{onContent(buffer)}} 
> listeners. When everything is finished, it calls the {{onComplete}} 
> listeners. One important thing to notice here: all listeners should finish 
> quickly. If a listener blocks, no further data for that request will be 
> handled until the listener finishes.
> h2. 3. Solution 1: Send requests async but spin up one thread per response
>  Jetty HttpClient already provides several listeners; one of them is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> Using this, the model of HttpShardHandler can be rewritten into something 
> like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although "sending req to shard1" is drawn wide, it won't take 
> long, since sending a request is a very quick operation. With this approach, 
> handling threads won't be spun up until the first bytes come back. Note that 
> we still have active threads waiting for more data from the InputStream.
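The two-thread hand-off described above (one short-lived thread feeding bytes as they arrive, one handler thread blocked reading them) can be sketched with plain java.io pipes. This is illustrative only, not Jetty's implementation:

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class StreamingResponseSketch {
    public static void main(String[] args) throws Exception {
        PipedOutputStream feed = new PipedOutputStream();
        PipedInputStream content = new PipedInputStream(feed);

        // Feeder thread: stand-in for onContent(buffer) callbacks that push
        // chunks into the InputStream as they arrive from the network.
        Thread feeder = new Thread(() -> {
            try (feed) {
                feed.write("chunk1|".getBytes());
                feed.write("chunk2".getBytes());
            } catch (IOException ignored) {
            }
        });
        feeder.start();

        // Handler thread: blocks on the InputStream until bytes are fed in.
        try (content) {
            System.out.println(new String(content.readAllBytes())); // chunk1|chunk2
        }
        feeder.join();
    }
}
```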
> h2. 4. Solution 2: Buffering data and handling it inside Jetty's threads
> Jetty has another listener called BufferingResponseListener. This is how it 
> is used:
> {code:java}
> client.newRequest(...).send(new 

[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201177#comment-17201177
 ] 

Uwe Schindler commented on SOLR-14889:
--

Small update:  [^SOLR-14889.patch] 

"replaceAll" is wrong; it must be "replace" (as we don't use a regex). A 
typical Java error!
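The distinction Uwe is pointing at: {{String.replaceAll}} compiles its first argument as a regular expression, while {{String.replace}} substitutes literally. A minimal illustration (not from the patch):

```java
public class ReplaceVsReplaceAll {
    public static void main(String[] args) {
        String path = "C:\\Users\\solr";

        // String.replace does a literal substitution:
        System.out.println(path.replace("\\", "/"));      // C:/Users/solr

        // String.replaceAll treats its first argument as a regex, so the
        // backslash must be escaped once more for the regex engine:
        System.out.println(path.replaceAll("\\\\", "/")); // C:/Users/solr

        // Passing a bare "\\" (a single backslash string) to replaceAll
        // would throw a PatternSyntaxException at runtime.
    }
}
```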

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-14889:
-
Attachment: SOLR-14889.patch

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201164#comment-17201164
 ] 

Uwe Schindler commented on SOLR-14889:
--

I also changed the logger.warn to logger.lifecycle when outputting the 
properties.

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch, SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201162#comment-17201162
 ] 

Uwe Schindler edited comment on SOLR-14889 at 9/23/20, 11:27 PM:
-

Here is my fix:
 [^SOLR-14889.patch] 

You need to create the empty map first and then populate it with escaped 
properties in doFirst. During configuration, the expand() method gets the empty 
map, which is populated in doFirst.

This is a quick hack; I don't like it. Maybe I'll have an idea tonight.


was (Author: thetaphi):
Here is my fix:
 [^SOLR-14889.patch] 

You need to create the expty map first and then populate it with escaped 
properties in doFirst. During configuration, the expand() method gets the empty 
map, which is populated in doFirst.

This is a quick hack; I don't like it. Maybe I have an idea this night.

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch, SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201162#comment-17201162
 ] 

Uwe Schindler commented on SOLR-14889:
--

Here is my fix:
 [^SOLR-14889.patch] 

You need to create the empty map first and then populate it with escaped 
properties in doFirst. During configuration, the expand() method gets the empty 
map, which is populated in doFirst.

This is a quick hack; I don't like it. Maybe I'll have an idea tonight.

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch, SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-14889:
-
Attachment: SOLR-14889.patch

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch, SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201157#comment-17201157
 ] 

Uwe Schindler edited comment on SOLR-14889 at 9/23/20, 11:13 PM:
-

That's very easy to explain: the expansion is done when the project is 
configured!

Previously it worked because you just set a pointer to the (still changing) 
props. Here the problem is that the collect loop runs during the configuration 
phase, so you set a pointer to the result during configuration.

To fix this, the whole expand must be delayed using lazy evaluation. It's late; 
I will try before going to bed.


was (Author: thetaphi):
That's very easy to explain: The expansion is done when the project is 
configured!

Previously it was working because you just set a pointer to the (still changing 
props). Here the problem is that the collect loop is running during 
configuration phase.

To fix this the whole expand must be delayed using lazy evaluation. It's later, 
will try before going to bed.

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201157#comment-17201157
 ] 

Uwe Schindler commented on SOLR-14889:
--

That's very easy to explain: the expansion is done when the project is 
configured!

Previously it worked because you just set a pointer to the (still changing) 
props. Here the problem is that the collect loop runs during the configuration 
phase.

To fix this, the whole expand must be delayed using lazy evaluation. It's late; 
I will try before going to bed.

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml

2020-09-23 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14889:
--
Attachment: SOLR-14889.patch
Status: Open  (was: Open)

I thought this would be straightforward, but there's clearly still a lot about 
the gradle lifecycle / order-of-evaluation that I don't understand.

The key change in the attached patch, which this whole idea hinges on, is...
{noformat}
-expand(templateProps)
+expand( templateProps.collectEntries({ k, v -> [k, v.replaceAll("'","''")]}) )
{noformat}
But for reasons I don't understand, this seems to bypass the changes made to 
{{templateProps}} in ' {{setupLazyProps.doFirst}} ', where the ivy version 
values are added...
{noformat}
Execution failed for task ':solr:solr-ref-guide:prepareSources'.
> Could not copy file 
> '/home/hossman/lucene/dev/solr/solr-ref-guide/src/_config.yml.template' to 
> '/home/hossman/lucene/dev/solr/solr-ref-guide/build/content/_config.yml'.
   > Missing property (ivyCommonsCodec) for Groovy template expansion. Defined 
keys [javadocLink, solrGuideDraftStatus, solrRootPath, solrDocsVersion, 
solrGuideVersionPath, htmlSolrJavadocs, htmlLuceneJavadocs, buildDate, 
buildYear, out].
{noformat}
(I'm also not clear where that 'out' key is coming from, but I have no idea 
whether it pre-dates this change.)

I experimented with adding a {{doFirst}} block to {{prepareSources}} that would 
copy the (escaped) templateProps into a newly defined Map in that task, which 
would be used in the {{expand(...)}} call -- but that still seemed to result in 
the {{expand(..)}} being evaluated before the {{doFirst}} modified the map (see 
the big commented-out nocommit block in the patch for what I mean).

[~uschindler] / [~dweiss] - can you help me understand what's going on here and 
how to do this "the right way"?
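The escaping step the patch applies (doubling single quotes so a value can be emitted as a YAML single-quoted scalar, in which backslashes are literal) can be sketched in plain Java; the helper name below is hypothetical, not from the patch:

```java
public class YamlSingleQuote {

    // Hypothetical helper: make a value safe for a YAML *single-quoted* scalar.
    // Inside single quotes the only YAML escape is '' for a literal ', so
    // Windows backslashes pass through untouched.
    static String yamlSingleQuote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        System.out.println(yamlSingleQuote("C:\\Users\\solr")); // 'C:\Users\solr'
        System.out.println(yamlSingleQuote("it's a path"));     // 'it''s a path'
    }
}
```

Note that {{replace}} (literal) is sufficient here; a regex-based {{replaceAll}} is unnecessary.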

> improve templated variable escaping in ref-guide _config.yml
> 
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14889.patch
>
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded 
> "relative" path to the solrRootPath to using groovy/project variables to get 
> the path.  The reason for the failures was that the path is used as a 
> variable templated into {{_config.yml.template}} to build the {{_config.yml}} 
> file, but on windows the path separator of '\' was being parsed by 
> jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the 
> SOLR-14824 changes, because the hardcoded relative path only used '/' 
> delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows 
> paths also have those. Fix was to add StringEscapeUtils, but I don't like 
> this too much. Maybe we find a better solution to make special characters in 
> those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this 
> one variable -- doesn't really protect other variables that might have 
> special characters in them down the road, and while "escapeJava" works OK for 
> the "\" issue, it isn't necessarily consistent with all YAML escapes, which 
> could lead to even weird bugs/confusion down the road.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…

2020-09-23 Thread GitBox


goankur commented on a change in pull request #1893:
URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r493919329



##
File path: 
lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyLabels.java
##
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.taxonomy;
+
+import org.apache.lucene.document.Document;
+import org.apache.lucene.facet.FacetField;
+import org.apache.lucene.facet.FacetTestCase;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.FacetsCollector.MatchingDocs;
+import org.apache.lucene.facet.FacetsConfig;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
+import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.MatchAllDocsQuery;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.util.IOUtils;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+public class TestTaxonomyLabels extends FacetTestCase {
+
+  private List<Document> prepareDocuments() {
+    List<Document> docs = new ArrayList<>();
+
+Document doc = new Document();
+doc.add(new FacetField("Author", "Bob"));
+doc.add(new FacetField("Publish Date", "2010", "10", "15"));
+docs.add(doc);
+
+doc = new Document();
+doc.add(new FacetField("Author", "Lisa"));
+doc.add(new FacetField("Publish Date", "2010", "10", "20"));
+docs.add(doc);
+
+doc = new Document();
+doc.add(new FacetField("Author", "Tom"));
+doc.add(new FacetField("Publish Date", "2012", "1", "1"));
+docs.add(doc);
+
+doc = new Document();
+doc.add(new FacetField("Author", "Susan"));
+doc.add(new FacetField("Publish Date", "2012", "1", "7"));
+docs.add(doc);
+
+doc = new Document();
+doc.add(new FacetField("Author", "Frank"));
+doc.add(new FacetField("Publish Date", "1999", "5", "5"));
+docs.add(doc);
+
+return docs;
+  }
+
+  private List<Integer> allDocIds(MatchingDocs m, boolean decreasingDocIds) throws IOException {
+DocIdSetIterator disi = m.bits.iterator();
+    List<Integer> docIds = new ArrayList<>();
+while (disi.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
+  docIds.add(disi.docID());
+}
+
+if (decreasingDocIds == true) {
+  Collections.reverse(docIds);
+}
+return docIds;
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, null, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels,
+                                             List<MatchingDocs> matchingDocs,
+                                             String dimension) throws IOException {
+    return lookupFacetLabels(taxoLabels, matchingDocs, dimension, false);
+  }
+
+  private List<FacetLabel> lookupFacetLabels(TaxonomyFacetLabels taxoLabels, List<MatchingDocs> matchingDocs,
+                                             String dimension, boolean decreasingDocIds) throws IOException {
+    List<FacetLabel> facetLabels = new ArrayList<>();
+
+for (MatchingDocs m : matchingDocs) {
+      TaxonomyFacetLabels.FacetLabelReader facetLabelReader = taxoLabels.getFacetLabelReader(m.context);
+      List<Integer> docIds = allDocIds(m, decreasingDocIds);
+  FacetLabel facetLabel;
+  for (Integer docId : docIds) {
+while (true) {
+  if (dimension != null) {
+facetLabel = facetLabelReader.nextFacetLabel(docId, dimension);
+  } else {
+facetLabel = facetLabelReader.nextFacetLabel(docId);
+  }
+
+  if (facetLabel == null) {
+break;
+  }
+  facetLabels.add(facetLabel);
+}
+  }
+}
+
+return facetLabels;
+  }
+
+
+  public void testBasic() throws Exception {

Review comment:
   Done in this revision




[GitHub] [lucene-solr] uschindler commented on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common

2020-09-23 Thread GitBox


uschindler commented on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-697990504


   > Create fake factory base classes in o.a.l.a.util for backward 
compatibility (?)
   
   We do this only in Lucene 9, so more important to add all changes to 
MIGRATE.md
   
   > Fix tests
   
   I mentioned this, as the META-INF/services files are not updated. This makes 
renamed analyzers fail to load, as SPI can't find them.
   As said before, we need an SPI load test that ensures that all analyzer 
components have a factory that loads successfully with SPI. Maybe move that test 
(abstract) to test-framework and create a test implementation instance for each 
module containing factories. The test in analysis/common is not enough anymore.
   
   > Fix gradle scripts (?)
   
   jflex regenerate may need to be adapted.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13682) command line option to export data to a file

2020-09-23 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201092#comment-17201092
 ] 

David Smiley commented on SOLR-13682:
-

The ref-guide addition to solr-control-script-reference.adoc is nice, but I was 
unable to find it there as a user.  I only found it using my committer 
sleuthing experience.  My first action as a user was to search the ref guide 
search box for the word "export" which uncovered exporting-result-sets.adoc.  
That page definitely seemed like it was spot-on, yet it didn't have information 
about this new cool tool.  Can you add a link there [~noble.paul]?

> command line option to export data to a file
> 
>
> Key: SOLR-13682
> URL: https://issues.apache.org/jira/browse/SOLR-13682
> Project: Solr
>  Issue Type: Improvement
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.3
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> example
> {code:java}
> bin/solr export -url http://localhost:8983/solr/gettingstarted
> {code}
> This will export all the docs in a collection called {{gettingstarted}} into 
> a file called {{gettingstarted.json}}
> additional options are
>  * {{format}} : {{jsonl}} (default) or {{javabin}}
>  * {{out}} : export file name 
>  * {{query}} : a custom query, default is {{*:*}}
>  * {{fields}}: a comma separated list of fields to be exported
>  * {{limit}} : number of docs; default is 100, send {{-1}} to export all the 
> docs
> h2. Importing using {{curl}}
> importing json file
> {code:java}
> curl -X POST -d @gettingstarted.json 
> http://localhost:18983/solr/gettingstarted/update/json/docs?commit=true
> {code}
> importing javabin format file
> {code:java}
> curl -X POST --header "Content-Type: application/javabin" --data-binary 
> @gettingstarted.javabin 
> http://localhost:7574/solr/gettingstarted/update?commit=true
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Commented] (LUCENE-9537) Add Indri Search Engine Functionality to Lucene

2020-09-23 Thread Cameron VandenBerg (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201089#comment-17201089
 ] 

Cameron VandenBerg commented on LUCENE-9537:


Hi Adrien,

Unfortunately, the smoothing score that we use is document specific, so I am 
not sure if I could make it "transferable".  I am definitely interested in 
brainstorming ways that we can make Indri fit into the Lucene architecture 
better, though.  Perhaps an example of how Indri smoothing scores work would be 
helpful.

 

Suppose we have an index with 4 documents (so sorry for the political nature 
of the documents... it's just what I can easily think of at the moment):

1) Donald Trump is the president of the United States.

2) There are three branches of government.  The president is the head of the 
executive branch.

3) Jane Doe is president of the PTO.

4) Trump was elected in the 2016 election.

 

Say that the query is: President Trump.

In this index, the term president occurs more often than the term Trump.  The 
smoothing score acts like an idf for the query terms, so that documents with 
just the term Trump will be ranked higher than documents with just the term 
president.

 

Consider documents 3 and 4, which have the same length and each contain one 
search term, but Document 4 has the rarer search term.  Therefore the smoothing 
score for the term Trump in Document 3 will be lower than the smoothing score 
for the term president in Document 4.  The addition of the smoothing scores for 
the terms that don't exist allows Document 4 to get a higher score and be 
ranked above Document 3.

 

Let me know whether this example makes sense.  Can you see a way that I can 
refactor the smoothing score so that it better fits into Lucene's existing 
architecture?  Or let me know if I misunderstood your comment and you still 
feel that what you suggested will work.

 

Thank you!
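To make the worked example above concrete, here is a toy, self-contained sketch of Dirichlet-smoothed query likelihood over the four documents. This is illustrative only, not Indri's or the patch's actual code; the value MU = 10 and the class name DirichletDemo are arbitrary choices for this tiny collection.

```java
import java.util.List;
import java.util.Map;

// Toy illustration (not Indri's actual code) of Dirichlet smoothing:
// even a document that lacks a query term receives a small background
// ("smoothing") score proportional to the term's collection frequency.
public class DirichletDemo {
    static final double MU = 10.0;                          // arbitrary for this tiny collection
    static final double COLLECTION_TOKENS = 38.0;           // total terms across the 4 docs
    // collection frequencies: "president" occurs 3 times, "trump" 2 times
    static final Map<String, Integer> CF = Map.of("president", 3, "trump", 2);

    // log P(q|d) = sum over query terms of log((tf(t,d) + MU * cf(t)/|C|) / (|d| + MU))
    public static double score(Map<String, Integer> termFreqs, int docLen, List<String> query) {
        double s = 0;
        for (String t : query) {
            int tf = termFreqs.getOrDefault(t, 0);
            double pBackground = CF.get(t) / COLLECTION_TOKENS;
            // MU * pBackground is the "smoothing score" granted even when tf == 0
            s += Math.log((tf + MU * pBackground) / (docLen + MU));
        }
        return s;
    }

    public static void main(String[] args) {
        List<String> query = List.of("president", "trump");
        // Document 3: "Jane Doe is president of the PTO" (7 terms, matches "president")
        double d3 = score(Map.of("president", 1), 7, query);
        // Document 4: "Trump was elected in the 2016 election" (7 terms, matches "trump")
        double d4 = score(Map.of("trump", 1), 7, query);
        System.out.println(d4 > d3);   // prints "true": the rarer matched term wins
    }
}
```

Because "trump" is the rarer term, the document that actually contains it (Document 4) outranks the one that only contains the common term "president", exactly as described above.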

> Add Indri Search Engine Functionality to Lucene
> ---
>
> Key: LUCENE-9537
> URL: https://issues.apache.org/jira/browse/LUCENE-9537
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Cameron VandenBerg
>Priority: Major
>  Labels: patch
> Attachments: LUCENE-INDRI.patch
>
>
> Indri ([http://lemurproject.org/indri.php]) is an academic search engine 
> developed by The University of Massachusetts and Carnegie Mellon University.  
> The major difference between Lucene and Indri is that Indri will give a 
> document a "smoothing score" to a document that does not contain the search 
> term, which has improved the search ranking accuracy in our experiments.  I 
> have created an Indri patch, which adds the search code needed to implement 
> the Indri AND logic as well as Indri's implementation of Dirichlet Smoothing.






[GitHub] [lucene-solr] madrob commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2

2020-09-23 Thread GitBox


madrob commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r493884949



##
File path: lucene/build.gradle
##
@@ -15,8 +15,56 @@
  * limitations under the License.
  */
 
+// Should we do this as :lucene:packaging similar to how Solr does it?
+// Or is this fine here?
+
+plugins {
+  id 'distribution'
+}
+
 description = 'Parent project for Apache Lucene Core'
 
 subprojects {
   group "org.apache.lucene"
-}
\ No newline at end of file
+}
+
+distributions {
+  main {
+  // This is empirically wrong, but it is mostly a copy from `ant 
package-zip`

Review comment:
   My goal here with getting things releasable is to also turn the smoke 
tester back on so that we can hopefully catch issues before we actually go to 
do the release. I understand there’s going to be more split related work, but 
that shouldn’t stop us from working on the pieces that we can work on before 
that. 








[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-09-23 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201067#comment-17201067
 ] 

Ishan Chattopadhyaya commented on SOLR-14354:
-

This doesn't have associated performance benchmarks for 8.7. 

bq. Would you recommend reverting from 8x? I'm not sure; it hasn't been shown 
to cause test failures that we can attribute here so seems safe from that end. 
At least where I work, it's something we'll use in our 8x fork and can serve as 
a canary.
We need to stop treating our users as guinea pigs.

-1 for 8.7 unless this is somehow made optional or there are performance 
benchmarks to prove its efficiency.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search requests will submit n requests (n 
> equals the number of shards) to an executor. Each request corresponds 
> to a thread; after sending a request that thread basically does nothing, just 
> waiting for the response from the other side. That thread will be swapped out and the CPU 
> will try to handle another thread (this is called a context switch: the CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) comes back, that thread will be called to parse the data, 
> then it will wait until more data comes back. So there will be lots of context 
> switching in the CPU. That is quite an inefficient use of threads. Basically we 
> want fewer threads, with most of them busy all the time, because threads 
> are not free, and neither is context switching. That is the main idea behind 
> everything, like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path;)
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread returns immediately without 
> blocking. Then when the client receives the headers from the other side, it will 
> call the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response) it will call the {{onContent(buffer)}} listeners. 
> When everything is finished it will call the {{onComplete}} listeners. One main 
> thing to notice here is that all listeners should finish quickly; if a 
> listener blocks, all further data of that request won’t be handled until the 
> listener finishes.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> 

[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1900: SOLR-14036: Remove explicit distrib=false from /terms handler

2020-09-23 Thread GitBox


dsmiley commented on a change in pull request #1900:
URL: https://github.com/apache/lucene-solr/pull/1900#discussion_r493862362



##
File path: solr/solr-ref-guide/src/major-changes-in-solr-9.adoc
##
@@ -128,6 +128,8 @@ _(raw; not yet edited)_
 * SOLR-14510: The `writeStartDocumentList` in `TextResponseWriter` now 
receives an extra boolean parameter representing the "exactness" of the 
numFound value (exact vs approximation).
   Any custom response writer extending `TextResponseWriter` will need to 
implement this abstract method now (instead previous with the same name but 
without the new boolean parameter).
 
+* SOLR-14036: Implicit /terms handler now supports distributed search by 
default, when running in cloud mode.

Review comment:
   Reworded to help a user think through upgrading:
   ```suggestion
   * SOLR-14036: Implicit /terms handler now returns terms across all shards in 
SolrCloud instead of only the local core.  Users/apps may be assuming the old 
behavior.  A request can be modified via the standard distrib=false param to 
only use the local core receiving the request.
   ```








[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2

2020-09-23 Thread GitBox


dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r493853135



##
File path: lucene/build.gradle
##
@@ -15,8 +15,56 @@
  * limitations under the License.
  */
 
+// Should we do this as :lucene:packaging similar to how Solr does it?
+// Or is this fine here?
+
+plugins {
+  id 'distribution'
+}
+
 description = 'Parent project for Apache Lucene Core'
 
 subprojects {
   group "org.apache.lucene"
-}
\ No newline at end of file
+}
+
+distributions {
+  main {
+  // This is empirically wrong, but it is mostly a copy from `ant 
package-zip`

Review comment:
   Haven't forgotten about it, just busy with work. Those release scripts 
will have to be adjusted for Solr and Lucene being released independently in the 
future. Which requires independent builds, which requires the repo split. Will have 
to get to it, eventually. Sigh.








[GitHub] [lucene-solr] dweiss commented on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common

2020-09-23 Thread GitBox


dweiss commented on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-697932731


   Hi Tomoko. The patch looks good to me (precommit doesn't pass though). I 
would commit it once you get precommit to work - this issue has been out 
there for a while, nobody objected. If there is a need for changes (on master), 
we'll just follow up.






[jira] [Commented] (SOLR-14892) shards.info with shards.tolerant can yield an empty key

2020-09-23 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201038#comment-17201038
 ] 

David Smiley commented on SOLR-14892:
-

I chased this down to 
org.apache.solr.handler.component.HttpShardHandler#createSliceShardsStr which, 
when given an empty list, returns an empty string. It should probably return 
null. But null has ripple effects in many places which assume non-null values 
and maybe were written without shards.tolerant in mind. Let's say it remains an 
empty string. SearchHandler.handleRequestBody loops over "sreq.actualShards", 
which can yield that empty string. I hoped simply "continue"-ing this loop on 
this occurrence might help but it led to some other mystery. The code involved 
in general here is awfully messy.
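The pitfall can be sketched in isolation. The following is a hypothetical stand-in, not the actual HttpShardHandler code; the method body, the "|" separator, and the class name are assumptions based on the description above:

```java
import java.util.List;

// Hypothetical stand-in for HttpShardHandler#createSliceShardsStr: joining an
// empty replica list yields "", which downstream code (e.g. loops over
// sreq.actualShards) can mistake for a real shard address. Returning null
// would make "no reachable replicas" explicit, at the cost of null-handling
// ripple effects elsewhere.
public class SliceShardsStr {
    public static String createSliceShardsStr(List<String> replicaUrls) {
        if (replicaUrls.isEmpty()) {
            return null;   // proposed behavior; the described current code returns ""
        }
        return String.join("|", replicaUrls);   // separator choice is an assumption here
    }

    public static void main(String[] args) {
        System.out.println(createSliceShardsStr(List.of("http://host1/solr", "http://host2/solr")));
        System.out.println(createSliceShardsStr(List.of()));   // prints "null"
    }
}
```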

> shards.info with shards.tolerant can yield an empty key
> ---
>
> Key: SOLR-14892
> URL: https://issues.apache.org/jira/browse/SOLR-14892
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: David Smiley
>Priority: Minor
> Attachments: solr14892.png
>
>
> When using shards.tolerant=true and shards.info=true when a shard isn't 
> available (and maybe other circumstances), the shards.info section of the 
> response may have an empty-string key child with a value that is ambiguous as 
> to which shard(s) couldn't be reached.
> This problem can be revealed by modifying 
> org.apache.solr.cloud.TestDownShardTolerantSearch#searchingShouldFailWithoutTolerantSearchSetToTrue
>  to add shards.info and then examine the response in a debugger.






[jira] [Created] (SOLR-14893) Allow UpdateRequestProcessors to add non-error messages to the response

2020-09-23 Thread Houston Putman (Jira)
Houston Putman created SOLR-14893:
-

 Summary: Allow UpdateRequestProcessors to add non-error messages 
to the response
 Key: SOLR-14893
 URL: https://issues.apache.org/jira/browse/SOLR-14893
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: UpdateRequestProcessors
Reporter: Houston Putman


There are many reasons why an UpdateRequestProcessor would want to send a 
response back to the user:
 * Informing the user on the results when they use schema-guessing mode 
(SOLR-14701)
 * Building a new Processor that uses the lucene monitor library to alert on 
incoming documents that match saved queries
 * The Language detection URPs could respond with the languages selected for 
each document.

Currently URPs can be passed in the Response object via the URPFactory that 
creates it. However, whenever the URP is placed in the chain after the 
DistributedURP, the response that it sends back will be dismissed by the DURP 
and not merged and sent back to the user.

The bulk of the logic here would be to add logic in the DURP to accept custom 
messages in the responses of the updates it sends, and then merge those into an 
overall response to send to the user. Each URP could be responsible for merging 
its section of responses, because that will likely contain business logic for 
the URP that the DURP is not aware of.

 

The SolrJ classes would also need updates to give the user an easy way to read 
response messages.
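The proposed merge step can be sketched with plain Java collections. This is a hedged illustration, not Solr code: plain Maps stand in for Solr's NamedList, and the URP name "langDetect" is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the merge described above: each shard returns a map of
// URP-name -> messages, and the distributed processor concatenates each
// URP's section across shards into one overall response.
public class UrpResponseMerge {
    public static Map<String, List<String>> merge(List<Map<String, List<String>>> shardResponses) {
        Map<String, List<String>> merged = new HashMap<>();
        for (Map<String, List<String>> shard : shardResponses) {
            for (Map.Entry<String, List<String>> e : shard.entrySet()) {
                // in practice each URP would own the merge logic for its section;
                // simple concatenation is the trivial default shown here
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, List<String>> merged = merge(List.of(
                Map.of("langDetect", List.of("doc1: en")),
                Map.of("langDetect", List.of("doc2: fr"))));
        System.out.println(merged);   // {langDetect=[doc1: en, doc2: fr]}
    }
}
```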






[jira] [Updated] (SOLR-14892) shards.info with shards.tolerant can yield an empty key

2020-09-23 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-14892:

Attachment: solr14892.png

> shards.info with shards.tolerant can yield an empty key
> ---
>
> Key: SOLR-14892
> URL: https://issues.apache.org/jira/browse/SOLR-14892
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: David Smiley
>Priority: Minor
> Attachments: solr14892.png
>
>
> When using shards.tolerant=true and shards.info=true when a shard isn't 
> available (and maybe other circumstances), the shards.info section of the 
> response may have an empty-string key child with a value that is ambiguous as 
> to which shard(s) couldn't be reached.
> This problem can be revealed by modifying 
> org.apache.solr.cloud.TestDownShardTolerantSearch#searchingShouldFailWithoutTolerantSearchSetToTrue
>  to add shards.info and then examine the response in a debugger.






[jira] [Created] (SOLR-14892) shards.info with shards.tolerant can yield an empty key

2020-09-23 Thread David Smiley (Jira)
David Smiley created SOLR-14892:
---

 Summary: shards.info with shards.tolerant can yield an empty key
 Key: SOLR-14892
 URL: https://issues.apache.org/jira/browse/SOLR-14892
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: search
Reporter: David Smiley


When using shards.tolerant=true and shards.info=true when a shard isn't 
available (and maybe other circumstances), the shards.info section of the 
response may have an empty-string key child with a value that is ambiguous as 
to which shard(s) couldn't be reached.

This problem can be revealed by modifying 
org.apache.solr.cloud.TestDownShardTolerantSearch#searchingShouldFailWithoutTolerantSearchSetToTrue
 to add shards.info and then examine the response in a debugger.






[GitHub] [lucene-solr] s1monw opened a new pull request #1918: LUCENE-9535: Commit DWPT bytes used before locking indexing

2020-09-23 Thread GitBox


s1monw opened a new pull request #1918:
URL: https://github.com/apache/lucene-solr/pull/1918


   Currently we calculate the ramBytesUsed by the DWPT under the flushControl
   lock. We can do this calculation safely outside of the lock without any 
downside.
   The FlushControl lock should be used with care since it's a central part of 
indexing
   and might block all indexing.
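The pattern behind the change can be shown with a generic sketch. This is not the actual Lucene flush-control code; the class and method names are invented for illustration: compute the expensive per-writer value outside the shared lock, then publish only the cheap result while holding it.

```java
// Generic sketch (not Lucene's DocumentsWriterFlushControl): move an
// expensive calculation out of a contended lock's critical section.
public class FlushAccounting {
    private final Object flushLock = new Object();   // stands in for FlushControl
    private long totalBytes;

    // stand-in for DWPT#ramBytesUsed(): reads only this thread's own
    // state, so it is safe to call without the shared lock
    long expensiveRamBytesUsed() {
        long sum = 0;
        for (int i = 0; i < 1000; i++) {
            sum += i;
        }
        return sum;
    }

    // before: the whole calculation runs under the contended lock,
    // potentially blocking all other indexing threads
    void commitUnderLock() {
        synchronized (flushLock) {
            totalBytes += expensiveRamBytesUsed();
        }
    }

    // after: only the cheap addition happens under the lock
    void commitOutsideLock() {
        long delta = expensiveRamBytesUsed();
        synchronized (flushLock) {
            totalBytes += delta;
        }
    }

    long totalBytes() {
        synchronized (flushLock) {
            return totalBytes;
        }
    }

    public static void main(String[] args) {
        FlushAccounting f = new FlushAccounting();
        f.commitOutsideLock();
        System.out.println(f.totalBytes());   // 499500 = sum of 0..999
    }
}
```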
   






[GitHub] [lucene-solr] arafalov commented on pull request #1863: SOLR-14701: GuessSchemaFields URP to replace AddSchemaFields URP in schemaless mode

2020-09-23 Thread GitBox


arafalov commented on pull request #1863:
URL: https://github.com/apache/lucene-solr/pull/1863#issuecomment-697846565


   > Purely responding to the URP response part, it’s definitely not possible 
for URP to send non-error responses. I do think its something we should 
implement though, since it will expand the use cases that URPs can solve. Ill 
create a JIRA for it.
   
   It may be possible to future-proof this implementation by making 
**guess-schema** a mode switch, instead of the current present/absent flag. 
So, maybe rename it to **guess-mode** instead, with options of:
   - **update** - current (only) option basically, 
   - **show** - (if/when there is a way to return suggested JSON), 
   - **update-all** - (if we wanted to - sometimes - have specific fields even 
if dynamicField definition matches; could be done now if useful), 
   - **none** to support tools easier. 






[jira] [Commented] (LUCENE-9535) Investigate recent indexing slowdown for wikimedium documents

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201012#comment-17201012
 ] 

ASF subversion and git services commented on LUCENE-9535:
-

Commit d226abd4481a5bd837264a7c53d1b13f417842ad in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d226abd ]

LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time. 
(#1917)
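The idea in this commit can be sketched generically. The following is not the actual ByteBuffersDataOutput implementation; the class name and block model are invented for illustration: maintain a running total as buffers are allocated, so the accounting method returns in constant time instead of iterating over every buffer on each call.

```java
import java.util.ArrayList;
import java.util.List;

// Generic sketch (not Lucene's ByteBuffersDataOutput): keep a running
// byte total updated on every allocation so ramBytesUsed() is O(1)
// rather than O(number of buffers).
public class CountingBlocks {
    private final List<byte[]> blocks = new ArrayList<>();
    private long bytesUsed;   // updated incrementally on every allocation

    void addBlock(int size) {
        blocks.add(new byte[size]);
        bytesUsed += size;
    }

    long ramBytesUsed() {
        return bytesUsed;   // constant time: no loop over blocks
    }

    public static void main(String[] args) {
        CountingBlocks out = new CountingBlocks();
        out.addBlock(10);
        out.addBlock(20);
        System.out.println(out.ramBytesUsed());   // prints 30
    }
}
```

This matters in the profile above because DWPTs call ramBytesUsed frequently during indexing, so a per-call loop over buffers adds up.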



> Investigate recent indexing slowdown for wikimedium documents
> -
>
> Key: LUCENE-9535
> URL: https://issues.apache.org/jira/browse/LUCENE-9535
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: cpu_profile.svg
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Nightly benchmarks report a ~10% slowdown for 1kB documents as of September 
> 9th: [http://people.apache.org/~mikemccand/lucenebench/indexing.html].
> On that day, we added stored fields in DWPT accounting (LUCENE-9511), so I 
> first thought this could be due to smaller flushed segments and more merging, 
> but I still wonder whether there's something else. The benchmark runs with 
> 8GB of heap, 2GB of RAM buffer and 36 indexing threads. So it's about 2GB/36 
> = 57MB of RAM buffer per thread in the worst-case scenario that all DWPTs get 
> full at the same time. Stored fields account for about 0.7MB of memory, or 1% 
> of the indexing buffer size. How can a 1% reduction of buffering capacity 
> explain a 10% indexing slowdown? I looked into this further by running 
> indexing benchmarks locally with 8 indexing threads and 128MB of indexing 
> buffer memory, which would make this issue even more apparent if the smaller 
> RAM buffer was the cause, but I'm not seeing a regression and actually I'm 
> seeing similar number of flushes when I disabled memory accounting for stored 
> fields.
> I ran indexing under a profiler to see whether something else could cause 
> this slowdown, e.g. slow implementations of ramBytesUsed on stored fields 
> writers, but nothing surprising showed up and the profile looked just like I 
> would have expected.
> Another question I have is why the 4kB benchmark is not affected at all.






[jira] [Commented] (LUCENE-9535) Investigate recent indexing slowdown for wikimedium documents

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201013#comment-17201013
 ] 

ASF subversion and git services commented on LUCENE-9535:
-

Commit a83c2c2ab00fea84ea48053a53276db905f05000 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a83c2c2 ]

LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time. 
(#1917)



> Investigate recent indexing slowdown for wikimedium documents
> -
>
> Key: LUCENE-9535
> URL: https://issues.apache.org/jira/browse/LUCENE-9535
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: cpu_profile.svg
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Nightly benchmarks report a ~10% slowdown for 1kB documents as of September 
> 9th: [http://people.apache.org/~mikemccand/lucenebench/indexing.html].
> On that day, we added stored fields in DWPT accounting (LUCENE-9511), so I 
> first thought this could be due to smaller flushed segments and more merging, 
> but I still wonder whether there's something else. The benchmark runs with 
> 8GB of heap, 2GB of RAM buffer and 36 indexing threads. So it's about 2GB/36 
> = 57MB of RAM buffer per thread in the worst-case scenario that all DWPTs get 
> full at the same time. Stored fields account for about 0.7MB of memory, or 1% 
> of the indexing buffer size. How can a 1% reduction of buffering capacity 
> explain a 10% indexing slowdown? I looked into this further by running 
> indexing benchmarks locally with 8 indexing threads and 128MB of indexing 
> buffer memory, which would make this issue even more apparent if the smaller 
> RAM buffer was the cause, but I'm not seeing a regression and actually I'm 
> seeing similar number of flushes when I disabled memory accounting for stored 
> fields.
> I ran indexing under a profiler to see whether something else could cause 
> this slowdown, e.g. slow implementations of ramBytesUsed on stored fields 
> writers, but nothing surprising showed up and the profile looked just like I 
> would have expected.
> Another question I have is why the 4kB benchmark is not affected at all.






[jira] [Commented] (LUCENE-9535) Investigate recent indexing slowdown for wikimedium documents

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201010#comment-17201010
 ] 

ASF subversion and git services commented on LUCENE-9535:
-

Commit d226abd4481a5bd837264a7c53d1b13f417842ad in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d226abd ]

LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time. 
(#1917)



> Investigate recent indexing slowdown for wikimedium documents
> -
>
> Key: LUCENE-9535
> URL: https://issues.apache.org/jira/browse/LUCENE-9535
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: cpu_profile.svg
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Nightly benchmarks report a ~10% slowdown for 1kB documents as of September 
> 9th: [http://people.apache.org/~mikemccand/lucenebench/indexing.html].
> On that day, we added stored fields in DWPT accounting (LUCENE-9511), so I 
> first thought this could be due to smaller flushed segments and more merging, 
> but I still wonder whether there's something else. The benchmark runs with 
> 8GB of heap, 2GB of RAM buffer and 36 indexing threads. So it's about 2GB/36 
> = 57MB of RAM buffer per thread in the worst-case scenario that all DWPTs get 
> full at the same time. Stored fields account for about 0.7MB of memory, or 1% 
> of the indexing buffer size. How can a 1% reduction of buffering capacity 
> explain a 10% indexing slowdown? I looked into this further by running 
> indexing benchmarks locally with 8 indexing threads and 128MB of indexing 
> buffer memory, which would make this issue even more apparent if the smaller 
> RAM buffer was the cause, but I'm not seeing a regression and actually I'm 
> seeing similar number of flushes when I disabled memory accounting for stored 
> fields.
> I ran indexing under a profiler to see whether something else could cause 
> this slowdown, e.g. slow implementations of ramBytesUsed on stored fields 
> writers, but nothing surprising showed up and the profile looked just like I 
> would have expected.
> Another question I have is why the 4kB benchmark is not affected at all.






[jira] [Commented] (LUCENE-9535) Investigate recent indexing slowdown for wikimedium documents

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201011#comment-17201011
 ] 

ASF subversion and git services commented on LUCENE-9535:
-

Commit a83c2c2ab00fea84ea48053a53276db905f05000 in lucene-solr's branch 
refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a83c2c2 ]

LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time. 
(#1917)



> Investigate recent indexing slowdown for wikimedium documents
> -
>
> Key: LUCENE-9535
> URL: https://issues.apache.org/jira/browse/LUCENE-9535
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: cpu_profile.svg
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Nightly benchmarks report a ~10% slowdown for 1kB documents as of September 
> 9th: [http://people.apache.org/~mikemccand/lucenebench/indexing.html].
> On that day, we added stored fields in DWPT accounting (LUCENE-9511), so I 
> first thought this could be due to smaller flushed segments and more merging, 
> but I still wonder whether there's something else. The benchmark runs with 
> 8GB of heap, 2GB of RAM buffer and 36 indexing threads. So it's about 2GB/36 
> = 57MB of RAM buffer per thread in the worst-case scenario that all DWPTs get 
> full at the same time. Stored fields account for about 0.7MB of memory, or 1% 
> of the indexing buffer size. How can a 1% reduction of buffering capacity 
> explain a 10% indexing slowdown? I looked into this further by running 
> indexing benchmarks locally with 8 indexing threads and 128MB of indexing 
> buffer memory, which would make this issue even more apparent if the smaller 
> RAM buffer was the cause, but I'm not seeing a regression and actually I'm 
> seeing similar number of flushes when I disabled memory accounting for stored 
> fields.
> I ran indexing under a profiler to see whether something else could cause 
> this slowdown, e.g. slow implementations of ramBytesUsed on stored fields 
> writers, but nothing surprising showed up and the profile looked just like I 
> would have expected.
> Another question I have is why the 4kB benchmark is not affected at all.







[GitHub] [lucene-solr] jpountz merged pull request #1917: LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time.

2020-09-23 Thread GitBox


jpountz merged pull request #1917:
URL: https://github.com/apache/lucene-solr/pull/1917


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9535) Investigate recent indexing slowdown for wikimedium documents

2020-09-23 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200946#comment-17200946
 ] 

Adrien Grand commented on LUCENE-9535:
--

I might have found something. When profiling indexing I noticed some contention 
in {{DocumentsWriterFlushControl#doAfterDocument}}, which happens to 
transitively call {{IndexingChain#ramBytesUsed}}, which was changed in 
LUCENE-9511 to call {{StoredFieldsWriter#ramBytesUsed}}. And 
{{StoredFieldsWriter#ramBytesUsed}} calls 
{{ByteBuffersDataOutput#ramBytesUsed}} which is a bit slow since it iterates 
over all pages. So we might have increased contention on 
{{DocumentsWriterFlushControl#doAfterDocument}} in LUCENE-9511, and this is 
only noticeable on Mike's beast because of the very high number of indexing 
threads (36). I opened https://github.com/apache/lucene-solr/pull/1917.
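
The fix in the linked PR replaces the page-walking computation with constant-time bookkeeping. A minimal sketch of the idea (hypothetical class, not the actual ByteBuffersDataOutput code): maintain a running total when a page is allocated, so the accessor never iterates over pages while a lock such as a synchronized flush-control method is held.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene's real class: keep a running total as
// pages are allocated so ramBytesUsed() runs in constant time.
class PagedOutputSketch {
    private final List<byte[]> pages = new ArrayList<>();
    private long ramBytesUsed = 0;

    void addPage(int size) {
        pages.add(new byte[size]);
        ramBytesUsed += size; // O(1) bookkeeping at allocation time
    }

    long ramBytesUsed() {
        // Constant time: no walk over `pages`, so a caller holding a
        // contended lock returns quickly.
        return ramBytesUsed;
    }

    public static void main(String[] args) {
        PagedOutputSketch out = new PagedOutputSketch();
        out.addPage(1024);
        out.addPage(4096);
        System.out.println(out.ramBytesUsed()); // prints 5120
    }
}
```

The trade-off is a field update on every allocation instead of a full scan on every read, which matters when reads happen once per indexed document.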

> Investigate recent indexing slowdown for wikimedium documents
> -
>
> Key: LUCENE-9535
> URL: https://issues.apache.org/jira/browse/LUCENE-9535
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: cpu_profile.svg
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Nightly benchmarks report a ~10% slowdown for 1kB documents as of September 
> 9th: [http://people.apache.org/~mikemccand/lucenebench/indexing.html].
> On that day, we added stored fields in DWPT accounting (LUCENE-9511), so I 
> first thought this could be due to smaller flushed segments and more merging, 
> but I still wonder whether there's something else. The benchmark runs with 
> 8GB of heap, 2GB of RAM buffer and 36 indexing threads. So it's about 2GB/36 
> = 57MB of RAM buffer per thread in the worst-case scenario that all DWPTs get 
> full at the same time. Stored fields account for about 0.7MB of memory, or 1% 
> of the indexing buffer size. How can a 1% reduction of buffering capacity 
> explain a 10% indexing slowdown? I looked into this further by running 
> indexing benchmarks locally with 8 indexing threads and 128MB of indexing 
> buffer memory, which would make this issue even more apparent if the smaller 
> RAM buffer was the cause, but I'm not seeing a regression and actually I'm 
> seeing similar number of flushes when I disabled memory accounting for stored 
> fields.
> I ran indexing under a profiler to see whether something else could cause 
> this slowdown, e.g. slow implementations of ramBytesUsed on stored fields 
> writers, but nothing surprising showed up and the profile looked just like I 
> would have expected.
> Another question I have is why the 4kB benchmark is not affected at all.






[GitHub] [lucene-solr] jpountz opened a new pull request #1917: LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time.

2020-09-23 Thread GitBox


jpountz opened a new pull request #1917:
URL: https://github.com/apache/lucene-solr/pull/1917


   This is called transitively from 
`DocumentsWriterFlushControl#doAfterDocument` which is synchronized and appears 
to be a point of contention.






[jira] [Commented] (SOLR-8281) Add RollupMergeStream to Streaming API

2020-09-23 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200886#comment-17200886
 ] 

Joel Bernstein commented on SOLR-8281:
--

[~gus], feel free to send me an email to discuss.

> Add RollupMergeStream to Streaming API
> --
>
> Key: SOLR-8281
> URL: https://issues.apache.org/jira/browse/SOLR-8281
> Project: Solr
>  Issue Type: Bug
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
>
> The RollupMergeStream merges the aggregate results emitted by the 
> RollupStream on *worker* nodes.
> This is designed to be used in conjunction with the HashJoinStream to perform 
> rollup Aggregations on the joined Tuples. The HashJoinStream will require the 
> tuples to be partitioned on the Join keys. To avoid needing to repartition on 
> the *group by* fields for the RollupStream, we can perform a merge of the 
> rolled up Tuples coming from the workers.
> The construct would look like this:
> {code}
> mergeRollup (...
>   parallel (...
> rollup (...
> hashJoin (
>   search(...),
>   search(...),
>   on="fieldA" 
> )
>  )
>  )
>)
> {code}
> The pseudo code above would push the *hashJoin* and *rollup* to the *worker* 
> nodes. The emitted rolled up tuples would be merged by the mergeRollup.
>  






[GitHub] [lucene-solr] madrob merged pull request #1916: Fix minor typo

2020-09-23 Thread GitBox


madrob merged pull request #1916:
URL: https://github.com/apache/lucene-solr/pull/1916


   






[GitHub] [lucene-solr] madrob commented on pull request #1916: Fix minor typo

2020-09-23 Thread GitBox


madrob commented on pull request #1916:
URL: https://github.com/apache/lucene-solr/pull/1916#issuecomment-697511015


   Thank you for finding and correcting this!






[GitHub] [lucene-solr] Hronom commented on a change in pull request #1864: SOLR-14850 ExactStatsCache NullPointerException when shards.tolerant=true

2020-09-23 Thread GitBox


Hronom commented on a change in pull request #1864:
URL: https://github.com/apache/lucene-solr/pull/1864#discussion_r493662797



##
File path: solr/core/src/java/org/apache/solr/search/stats/ExactStatsCache.java
##
@@ -94,6 +94,12 @@ protected ShardRequest 
doRetrieveStatsRequest(ResponseBuilder rb) {
   protected void doMergeToGlobalStats(SolrQueryRequest req, 
List<ShardResponse> responses) {
 Set allTerms = new HashSet<>();
 for (ShardResponse r : responses) {
+  if 
("true".equalsIgnoreCase(req.getParams().get(ShardParams.SHARDS_TOLERANT)) && 
r.getException() != null) {

Review comment:
   @sigram @madrob I added a test that reproduces the problem in 
`TestExactStatsCache`; it fails with a NullPointerException if you remove my fix.
   
   Please adjust it (if needed) so it fits nicely into the Solr test suites; I 
have enabled `Allow edits by maintainers`.
   
   The tricky part of this issue is that it is reproducible only when at least 
one shard is fully down (no healthy replica). That is why I didn't use 
`setDistributedParams`: it adds one working replica, so all shards stay healthy 
and there is never a situation in which a shard is completely down.
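
The diff above can be read as the following self-contained sketch (the `ShardResponse` stand-in is hypothetical, not Solr's real class): when `shards.tolerant=true`, a response from a downed shard carries an exception and no stats, so the merge loop must skip it instead of dereferencing data that was never returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal stand-ins (not Solr's actual classes) illustrating
// the shape of the fix for SOLR-14850.
class TolerantMergeSketch {
    static class ShardResponse {
        final Exception exception;   // non-null when the shard request failed
        final List<String> terms;    // null when the shard was down
        ShardResponse(Exception e, List<String> terms) {
            this.exception = e;
            this.terms = terms;
        }
        Exception getException() { return exception; }
    }

    static List<String> mergeTerms(boolean shardsTolerant, List<ShardResponse> responses) {
        List<String> allTerms = new ArrayList<>();
        for (ShardResponse r : responses) {
            // The fix: tolerate the failed shard rather than NPE below.
            if (shardsTolerant && r.getException() != null) {
                continue;
            }
            allTerms.addAll(r.terms); // would throw NPE for a failed shard
        }
        return allTerms;
    }

    public static void main(String[] args) {
        List<ShardResponse> responses = Arrays.asList(
            new ShardResponse(null, Arrays.asList("a", "b")),
            new ShardResponse(new RuntimeException("shard down"), null));
        System.out.println(mergeTerms(true, responses)); // prints [a, b]
    }
}
```

Without the tolerant check, the second response's null stats would be dereferenced and the whole request would fail, defeating the point of `shards.tolerant=true`.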









[GitHub] [lucene-solr] dalbani opened a new pull request #1916: Fix minor typo

2020-09-23 Thread GitBox


dalbani opened a new pull request #1916:
URL: https://github.com/apache/lucene-solr/pull/1916


   Ignoring the default issue template given that this PR is about a tiny fix 
for a typo. Right?






[GitHub] [lucene-solr] mayya-sharipova merged pull request #1915: Fix bug in sort optimization (#1903)

2020-09-23 Thread GitBox


mayya-sharipova merged pull request #1915:
URL: https://github.com/apache/lucene-solr/pull/1915


   






[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #1915: Fix bug in sort optimization (#1903)

2020-09-23 Thread GitBox


mayya-sharipova opened a new pull request #1915:
URL: https://github.com/apache/lucene-solr/pull/1915


   Fix a bug in how the iterator with skipping functionality
   advances and produces docs.
   
   Relates to #1725
   Backport for #1903






[GitHub] [lucene-solr] HoustonPutman commented on pull request #1863: SOLR-14701: GuessSchemaFields URP to replace AddSchemaFields URP in schemaless mode

2020-09-23 Thread GitBox


HoustonPutman commented on pull request #1863:
URL: https://github.com/apache/lucene-solr/pull/1863#issuecomment-697420014


   Purely responding to the URP response part: it’s definitely not possible for 
a URP to send non-error responses. I do think it's something we should implement, 
though, since it will expand the use cases that URPs can solve. I'll create a 
JIRA for it.






[jira] [Commented] (SOLR-8281) Add RollupMergeStream to Streaming API

2020-09-23 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200848#comment-17200848
 ] 

Gus Heck commented on SOLR-8281:


This seems related to something I wanted to do for a client: I had a reduce 
with group() and wanted to then feed the groups to an arbitrary streaming 
expression for further processing, with the result showing up in the groups 
(the result would have been a matrix). The problem I got stuck on was how to 
express the stream that processes a group without it having a source (the 
source is the group itself).

> Add RollupMergeStream to Streaming API
> --
>
> Key: SOLR-8281
> URL: https://issues.apache.org/jira/browse/SOLR-8281
> Project: Solr
>  Issue Type: Bug
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
>
> The RollupMergeStream merges the aggregate results emitted by the 
> RollupStream on *worker* nodes.
> This is designed to be used in conjunction with the HashJoinStream to perform 
> rollup Aggregations on the joined Tuples. The HashJoinStream will require the 
> tuples to be partitioned on the Join keys. To avoid needing to repartition on 
> the *group by* fields for the RollupStream, we can perform a merge of the 
> rolled up Tuples coming from the workers.
> The construct would look like this:
> {code}
> mergeRollup (...
>   parallel (...
> rollup (...
> hashJoin (
>   search(...),
>   search(...),
>   on="fieldA" 
> )
>  )
>  )
>)
> {code}
> The pseudo code above would push the *hashJoin* and *rollup* to the *worker* 
> nodes. The emitted rolled up tuples would be merged by the mergeRollup.
>  






[GitHub] [lucene-solr] arafalov commented on pull request #1863: SOLR-14701: GuessSchemaFields URP to replace AddSchemaFields URP in schemaless mode

2020-09-23 Thread GitBox


arafalov commented on pull request #1863:
URL: https://github.com/apache/lucene-solr/pull/1863#issuecomment-697406760


   Ok, I am glad we are on the same page that the current (let's call it _Add_) 
solution is rather bad despite all the great work put into it. Let's now also 
get onto the same page about the next step you are actually proposing. I can 
read the rest of your statement in one of the following ways:
   
   1. Neither the original _Add_ nor the proposed _Guess_ solution will address 
the problem. **Next step: that discussion is not about code and should be taken 
up in the parent JIRA.**
   That's exactly what it is there for, and this code/PR is here to push the 
discussion from theoretical to practical.
   2. The _Guess_ approach is ok overall, but the schema creation is still bad; 
could it return schema-generation commands instead? I just double-checked the 
code, and there is no way for the current architecture to return non-error 
feedback (from either the processCommit or the SimplePostTool side). **Next 
step: propose a way this could be done.**
   Do note that the reason we are still a URP is that any schema guessing or 
creation depends on the previous URPs in the chain always being enabled (e.g. 
for custom date formats); that is one of the things really broken with the 
enable/disable flag in the _Add_ solution, and why I am adding the 
single-URP-level flag.
   3. We need some other _Guess_ approach. **Next action: propose an alternative 
architecture, preferably as a straw-man implementation.**
   This would give people on the JIRA a chance to select from TWO ways forward; 
that would be amazing whether we end up with one, the other, or a merged 
solution.
   4. ??? Use a veto and keep the status quo until somebody else has a much 
better idea than the people in the last 3 JIRAs?
   5. ??? (I don't claim to read your mind, but I want to move this discussion 
forward in concrete, non-blocking steps.)






[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1903: Fix bug in sort optimization

2020-09-23 Thread GitBox


mayya-sharipova commented on a change in pull request #1903:
URL: https://github.com/apache/lucene-solr/pull/1903#discussion_r493610854



##
File path: 
lucene/core/src/test/org/apache/lucene/search/TestFieldSortOptimizationSkipping.java
##
@@ -432,7 +439,48 @@ public void testDocSortOptimization() throws IOException {
   assertTrue(topDocs.totalHits.value < 10); // assert that very few docs 
were collected
 }
 
+reader.close();
+dir.close();
+  }
+
+  /**
+   * Test that sorting on _doc works correctly.
+   * This test goes through DefaultBulkSorter::scoreRange, where 
scorerIterator is BitSetIterator.
+   * As a conjunction of this BitSetIterator with DocComparator's iterator, we 
get BitSetConjunctionDISI.
+   * BitSetConjunctionDISI advances based on the DocComparator's iterator, and 
doesn't consider
+   * that its BitSetIterator may have advanced past a certain doc.

Review comment:
   Issue created: https://issues.apache.org/jira/browse/LUCENE-9541








[jira] [Updated] (LUCENE-9541) BitSetConjunctionDISI doesn't advance based on its components

2020-09-23 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova updated LUCENE-9541:

Summary: BitSetConjunctionDISI doesn't advance based on its components  
(was: BitSetConjunctionDISI can advance to docs before its components)

> BitSetConjunctionDISI doesn't advance based on its components
> -
>
> Key: LUCENE-9541
> URL: https://issues.apache.org/jira/browse/LUCENE-9541
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead – the DocIdSetIterator – and 
> doesn't consider that its other component – the BitSetIterator – may have 
> already advanced past a certain doc. This may result in duplicate documents.
> For example, if BitSetConjunctionDISI _disi_ is composed of DocIdSetIterator 
> _a_ over docs [0,1] and BitSetIterator _b_ over docs [0,1]: doing `b.nextDoc()` 
> we collect doc0, and doing `disi.nextDoc()` we collect the same doc0 again.
> It seems that other conjunction iterators don't have this behaviour: if we 
> advance any of their components past a certain document, the whole 
> conjunction iterator will also be advanced past this document.
>  
> This behaviour was exposed in this 
> [PR|https://github.com/apache/lucene-solr/pull/1903]. 
>  
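The duplicated-doc scenario in the description can be modelled with a self-contained sketch (plain Java with stand-in classes, not the actual Lucene DocIdSetIterator/BitSetIterator API): the conjunction advances only from its lead and merely tests the bit set, ignoring the bit-set iterator's own position, so a doc already consumed from the bit-set side is returned again.

```java
import java.util.BitSet;

public class ConjunctionModel {
    // Lead iterator over docs [0, 1], mirroring DocIdSetIterator semantics.
    static class Lead {
        private final int[] docs = {0, 1};
        private int idx = -1;
        int nextDoc() { idx++; return idx < docs.length ? docs[idx] : Integer.MAX_VALUE; }
    }

    // Bit-set iterator over docs [0, 1] with its own cursor.
    static class Bits {
        private final BitSet bits;
        private int current = -1;
        Bits(BitSet bits) { this.bits = bits; }
        int nextDoc() {
            current = bits.nextSetBit(current + 1);
            return current < 0 ? Integer.MAX_VALUE : current;
        }
        boolean get(int doc) { return bits.get(doc); }
    }

    // Conjunction that, like the reported behaviour, advances only from the
    // lead and only *tests* membership, never consulting the bit-set cursor.
    static int nextDoc(Lead lead, Bits bits) {
        int doc;
        while ((doc = lead.nextDoc()) != Integer.MAX_VALUE) {
            if (bits.get(doc)) return doc;   // bit-set position is ignored here
        }
        return Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        BitSet b = new BitSet();
        b.set(0); b.set(1);
        Bits bits = new Bits(b);
        Lead lead = new Lead();

        int first = bits.nextDoc();      // caller consumes doc 0 from the bit-set side
        int again = nextDoc(lead, bits); // conjunction returns doc 0 a second time
        System.out.println(first + " " + again); // prints "0 0": the duplicate
    }
}
```

A conjunction that kept all components in sync would instead advance past doc 0 once any component had consumed it.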



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9541) BitSetConjunctionDISI can advance to docs before its components

2020-09-23 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova updated LUCENE-9541:

Description: 
Not completely sure if this is a bug.

BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
doesn't consider that its other component – a BitSetIterator – may have 
already advanced past a certain doc. This may result in duplicate documents.

For example, suppose BitSetConjunctionDISI _disi_ is composed of a DocIdSetIterator 
_a_ over docs [0,1] and a BitSetIterator _b_ over docs [0,1]. Calling `b.nextDoc()` 
we collect doc0; calling `disi.nextDoc()` we again collect the same doc0.

It seems that other conjunction iterators don't have this behaviour: if we 
advance any of their components past a certain document, the whole conjunction 
iterator is also advanced past that document. 

 

This behaviour was exposed in this 
[PR|https://github.com/apache/lucene-solr/pull/1903]. 

 

  was:
Not completely sure if this is a bug.

BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
doesn't consider that its other component – a BitSetIterator – may have 
already advanced past a certain doc. This may result in duplicate documents.

This behaviour was exposed in this 
[PR|https://github.com/apache/lucene-solr/pull/1903]. 

 


> BitSetConjunctionDISI can advance to docs before its components
> ---
>
> Key: LUCENE-9541
> URL: https://issues.apache.org/jira/browse/LUCENE-9541
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
> doesn't consider that its other component – a BitSetIterator – may have 
> already advanced past a certain doc. This may result in duplicate documents.
> For example, suppose BitSetConjunctionDISI _disi_ is composed of a DocIdSetIterator 
> _a_ over docs [0,1] and a BitSetIterator _b_ over docs [0,1]. Calling `b.nextDoc()` 
> we collect doc0; calling `disi.nextDoc()` we again collect the same doc0.
> It seems that other conjunction iterators don't have this behaviour: if we 
> advance any of their components past a certain document, the whole 
> conjunction iterator is also advanced past that document. 
>  
> This behaviour was exposed in this 
> [PR|https://github.com/apache/lucene-solr/pull/1903]. 
>  






[jira] [Updated] (LUCENE-9541) BitSetConjunctionDISI can advance to docs before its components

2020-09-23 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova updated LUCENE-9541:

Summary: BitSetConjunctionDISI can advance to docs before its components  
(was: BitSetConjunctionDISI can advance backwards from its components)

> BitSetConjunctionDISI can advance to docs before its components
> ---
>
> Key: LUCENE-9541
> URL: https://issues.apache.org/jira/browse/LUCENE-9541
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
> doesn't consider that its other component – a BitSetIterator – may have 
> already advanced past a certain doc. This may result in duplicate documents.
> This behaviour was exposed in this PR. 
>  






[jira] [Updated] (LUCENE-9541) BitSetConjunctionDISI can advance to docs before its components

2020-09-23 Thread Mayya Sharipova (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayya Sharipova updated LUCENE-9541:

Description: 
Not completely sure if this is a bug.

BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
doesn't consider that its other component – a BitSetIterator – may have 
already advanced past a certain doc. This may result in duplicate documents.

This behaviour was exposed in this 
[PR|https://github.com/apache/lucene-solr/pull/1903]. 

 

  was:
Not completely sure if this is a bug.

BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
doesn't consider that its other component – a BitSetIterator – may have 
already advanced past a certain doc. This may result in duplicate documents.

This behaviour was exposed in this PR. 

 


> BitSetConjunctionDISI can advance to docs before its components
> ---
>
> Key: LUCENE-9541
> URL: https://issues.apache.org/jira/browse/LUCENE-9541
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Mayya Sharipova
>Priority: Minor
>
> Not completely sure if this is a bug.
> BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
> doesn't consider that its other component – a BitSetIterator – may have 
> already advanced past a certain doc. This may result in duplicate documents.
> This behaviour was exposed in this 
> [PR|https://github.com/apache/lucene-solr/pull/1903]. 
>  






[jira] [Created] (LUCENE-9541) BitSetConjunctionDISI can advance backwards from its components

2020-09-23 Thread Mayya Sharipova (Jira)
Mayya Sharipova created LUCENE-9541:
---

 Summary: BitSetConjunctionDISI can advance backwards from its 
components
 Key: LUCENE-9541
 URL: https://issues.apache.org/jira/browse/LUCENE-9541
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Mayya Sharipova


Not completely sure if this is a bug.

BitSetConjunctionDISI advances based on its lead – a DocIdSetIterator – and 
doesn't consider that its other component – a BitSetIterator – may have 
already advanced past a certain doc. This may result in duplicate documents.

This behaviour was exposed in this PR. 

 






[GitHub] [lucene-solr] mocobeta commented on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common

2020-09-23 Thread GitBox


mocobeta commented on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-697375125


   @uschindler seems busy.
   
   I don't want to maintain this branch for very long (the diff is so large), 
but I need at least one reviewer to proceed with this.
   @dweiss, would you take care of this, if you have some time?






[GitHub] [lucene-solr] munendrasn commented on pull request #1914: Move 9x upgrade notes out of changes.txt

2020-09-23 Thread GitBox


munendrasn commented on pull request #1914:
URL: https://github.com/apache/lucene-solr/pull/1914#issuecomment-697368032


   @noblepaul @sigram Please review. I have moved the entries you added, 
so I would prefer your reviews.






[jira] [Commented] (SOLR-14787) Inequality support in Payload Check query parser

2020-09-23 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200828#comment-17200828
 ] 

Gus Heck commented on SOLR-14787:
-

I have found something interesting WRT the failing case you mention... it only 
fails when I run the test in my IDE. If I use the ant build it passes. I notice 
some interesting differences in startup for these two scenarios... 

build:

 
{code:java}
   [junit4] Suite: org.apache.solr.search.TestPayloadCheckQParserPlugin
   [junit4]   2> 1454 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.SolrTestCase Setting 'solr.default.confdir' system property to 
test-framework derived value of 
'/home/gus/projects/apache/lucene-solr/fork/lucene-solr8/solr/server/solr/configsets/_default/conf'
   [junit4]   2> 1475 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.SolrTestCaseJ4 Created dataDir: 
/home/gus/projects/apache/lucene-solr/fork/lucene-solr8/solr/build/solr-core/test/J0/temp/solr.search.TestPayloadCheckQParserPlugin_AB5E0FC0380BB866-001/data-dir-1-001
   [junit4]   2> 1551 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.SolrTestCaseJ4 Using TrieFields (NUMERIC_POINTS_SYSPROP=false) 
w/NUMERIC_DOCVALUES_SYSPROP=true
   [junit4]   2> 1592 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.e.j.u.log Logging initialized @1620ms to org.eclipse.jetty.util.log.Slf4jLog
   [junit4]   2> 1597 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.SolrTestCaseJ4 Randomized ssl (false) and clientAuth (true) via: 
@org.apache.solr.util.RandomizeSSL(reason=, ssl=NaN, value=NaN, clientAuth=NaN)
   [junit4]   2> 1621 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.SolrTestCaseJ4 SecureRandom sanity checks: 
test.solr.allowed.securerandom=null & java.security.egd=file:/dev/./urandom
   [junit4]   2> 1626 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.SolrTestCaseJ4 initCore
   [junit4]   2> 1757 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrConfig Using Lucene MatchVersion: 8.7.0
   [junit4]   2> 1901 INFO  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.s.IndexSchema Schema name=example
   [junit4]   2> 1931 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.TrieIntField]. Please consult documentation how to replace it accordingly.
   [junit4]   2> 1936 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.TrieFloatField]. Please consult documentation how to replace it 
accordingly.
   [junit4]   2> 1940 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.TrieLongField]. Please consult documentation how to replace it 
accordingly.
   [junit4]   2> 1944 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.TrieDoubleField]. Please consult documentation how to replace it 
accordingly.
   [junit4]   2> 1966 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.TrieDateField]. Please consult documentation how to replace it 
accordingly.
   [junit4]   2> 2202 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.GeoHashField]. Please consult documentation how to replace it accordingly.
   [junit4]   2> 2208 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.LatLonType]. Please consult documentation how to replace it accordingly.
   [junit4]   2> 2217 WARN  
(SUITE-TestPayloadCheckQParserPlugin-seed#[AB5E0FC0380BB866]-worker) [ ] 
o.a.s.c.SolrResourceLoader Solr loaded a deprecated plugin/analysis class 
[solr.EnumField]. Please consult documentation how to replace it accordingly.


{code}
IDE (Intellij)

 

 
{code:java}
1172 INFO  (SUITE-TestPayloadCheckQParserPlugin-seed#[5A2517E33080AEE6]-worker) 
[ ] o.a.s.SolrTestCase Setting 'solr.default.confdir' system property to 
test-framework derived value of 
'/home/gus/projects/apache/lucene-solr/fork/lucene-solr/solr/server/solr/configsets/_default/conf'
1190 INFO  

[GitHub] [lucene-solr] munendrasn opened a new pull request #1914: Move 9x upgrade notes out of changes.txt

2020-09-23 Thread GitBox


munendrasn opened a new pull request #1914:
URL: https://github.com/apache/lucene-solr/pull/1914


   Upgrade notes have been moved out of changes.txt. While working on PR #1900, 
I found a few entries that were still present in changes.txt (most 
likely added at a later time).






[GitHub] [lucene-solr] munendrasn commented on pull request #1900: SOLR-14036: Remove explicit distrib=false from /terms handler

2020-09-23 Thread GitBox


munendrasn commented on pull request #1900:
URL: https://github.com/apache/lucene-solr/pull/1900#issuecomment-697360102


   I have included the changes and the upgrade entry. Instead of adding the upgrade 
entry to `solr-upgrade-notes.adoc`, I have added it to 
`major-changes-in-solr-9.adoc`, as mentioned in the former doc.






[GitHub] [lucene-solr] mayya-sharipova merged pull request #1903: Fix bug in sort optimization

2020-09-23 Thread GitBox


mayya-sharipova merged pull request #1903:
URL: https://github.com/apache/lucene-solr/pull/1903


   






[GitHub] [lucene-solr] mayya-sharipova commented on a change in pull request #1903: Fix bug in sort optimization

2020-09-23 Thread GitBox


mayya-sharipova commented on a change in pull request #1903:
URL: https://github.com/apache/lucene-solr/pull/1903#discussion_r493568956



##
File path: 
lucene/core/src/test/org/apache/lucene/search/TestFieldSortOptimizationSkipping.java
##
@@ -432,7 +439,48 @@ public void testDocSortOptimization() throws IOException {
   assertTrue(topDocs.totalHits.value < 10); // assert that very few docs 
were collected
 }
 
+reader.close();
+dir.close();
+  }
+
+  /**
+   * Test that sorting on _doc works correctly.
+   * This test goes through DefaultBulkSorter::scoreRange, where scorerIterator is BitSetIterator.
+   * As a conjunction of this BitSetIterator with DocComparator's iterator, we get BitSetConjunctionDISI.
+   * BitSetConjunctionDISI advances based on the DocComparator's iterator, and doesn't consider
+   * that its BitSetIterator may have advanced past a certain doc. 

Review comment:
   I will create an issue for this. 








[jira] [Commented] (SOLR-14503) Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property

2020-09-23 Thread Colvin Cowie (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200811#comment-17200811
 ] 

Colvin Cowie commented on SOLR-14503:
-

Hi [~munendrasn], thanks. Sorry I've not got any time at the moment. Thanks

> Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property
> ---
>
> Key: SOLR-14503
> URL: https://issues.apache.org/jira/browse/SOLR-14503
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, 
> 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1
>Reporter: Colvin Cowie
>Assignee: Munendra S N
>Priority: Minor
> Attachments: SOLR-14503.patch, SOLR-14503.patch
>
>
> When starting Solr in cloud mode, if zookeeper is not available within 30 
> seconds, then core container initialization fails and the node will not 
> recover when zookeeper is available.
>  
> I believe SOLR-5129 should have addressed this issue, however it doesn't 
> quite do so for two reasons:
>  # 
> [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L297]
>  it calls {{SolrZkClient(String zkServerAddress, int zkClientTimeout)}} 
> rather than {{SolrZkClient(String zkServerAddress, int zkClientTimeout, int 
> zkClientConnectTimeout)}} so the DEFAULT_CLIENT_CONNECT_TIMEOUT of 30 seconds 
> is used even when you specify a different waitForZk value
>  # bin/solr contains script to set -DwaitForZk from the SOLR_WAIT_FOR_ZK 
> environment property 
> [https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2148] but 
> there is no corresponding assignment in bin/solr.cmd, while SOLR_WAIT_FOR_ZK 
> appears in the solr.in.cmd as an example.
>  
> I will attach a patch that fixes the above.
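The constructor mismatch in point 1 can be sketched with hypothetical stand-in classes (not the real SolrZkClient API): the two-argument form delegates to the three-argument one with a hard-coded default, so a user-configured waitForZk value never reaches the connect timeout.

```java
public class ZkTimeoutSketch {
    static final int DEFAULT_CLIENT_CONNECT_TIMEOUT = 30_000; // 30s default, as described above

    // Stand-in for the client with the two overloaded constructors.
    static class ZkClientModel {
        final int connectTimeout;

        // Two-arg form: the connect timeout silently falls back to the default.
        ZkClientModel(String addr, int clientTimeout) {
            this(addr, clientTimeout, DEFAULT_CLIENT_CONNECT_TIMEOUT);
        }

        // Three-arg form: honours an explicitly supplied connect timeout.
        ZkClientModel(String addr, int clientTimeout, int connectTimeout) {
            this.connectTimeout = connectTimeout;
        }
    }

    public static void main(String[] args) {
        int waitForZk = 120_000; // user asked to wait 120s for ZooKeeper

        ZkClientModel viaTwoArg = new ZkClientModel("zk:2181", 15_000);
        ZkClientModel viaThreeArg = new ZkClientModel("zk:2181", 15_000, waitForZk);

        // Only the three-arg call respects the configured waitForZk value.
        System.out.println(viaTwoArg.connectTimeout + " " + viaThreeArg.connectTimeout);
    }
}
```

The fix described in the issue amounts to switching the dispatch-filter call site from the first form to the second.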






[jira] [Assigned] (SOLR-14333) Implement toString() in CollapsingPostFilter

2020-09-23 Thread Munendra S N (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N reassigned SOLR-14333:
---

Assignee: Munendra S N

> Implement toString() in CollapsingPostFilter
> 
>
> Key: SOLR-14333
> URL: https://issues.apache.org/jira/browse/SOLR-14333
> Project: Solr
>  Issue Type: Improvement
>Reporter: Munendra S N
>Assignee: Munendra S N
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{toString()}} is not overridden in CollapsingPostFilter. Debug component 
> returns {{parsed_filter_queries}}, for multiple CollapsingPostFilter in 
> request, value in {{parsed_filter_queries}} is always 
> {{CollapsingPostFilter()}}






[jira] [Commented] (SOLR-14503) Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property

2020-09-23 Thread Munendra S N (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200808#comment-17200808
 ] 

Munendra S N commented on SOLR-14503:
-

I'm planning to commit the current patch and handle other cases of zkClientTimeout 
usage in a separate issue.

> Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property
> ---
>
> Key: SOLR-14503
> URL: https://issues.apache.org/jira/browse/SOLR-14503
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, 
> 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1
>Reporter: Colvin Cowie
>Assignee: Munendra S N
>Priority: Minor
> Attachments: SOLR-14503.patch, SOLR-14503.patch
>
>
> When starting Solr in cloud mode, if zookeeper is not available within 30 
> seconds, then core container initialization fails and the node will not 
> recover when zookeeper is available.
>  
> I believe SOLR-5129 should have addressed this issue, however it doesn't 
> quite do so for two reasons:
>  # 
> [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L297]
>  it calls {{SolrZkClient(String zkServerAddress, int zkClientTimeout)}} 
> rather than {{SolrZkClient(String zkServerAddress, int zkClientTimeout, int 
> zkClientConnectTimeout)}} so the DEFAULT_CLIENT_CONNECT_TIMEOUT of 30 seconds 
> is used even when you specify a different waitForZk value
>  # bin/solr contains script to set -DwaitForZk from the SOLR_WAIT_FOR_ZK 
> environment property 
> [https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2148] but 
> there is no corresponding assignment in bin/solr.cmd, while SOLR_WAIT_FOR_ZK 
> appears in the solr.in.cmd as an example.
>  
> I will attach a patch that fixes the above.






[jira] [Assigned] (SOLR-14503) Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property

2020-09-23 Thread Munendra S N (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N reassigned SOLR-14503:
---

Assignee: Munendra S N

> Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property
> ---
>
> Key: SOLR-14503
> URL: https://issues.apache.org/jira/browse/SOLR-14503
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, 
> 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1
>Reporter: Colvin Cowie
>Assignee: Munendra S N
>Priority: Minor
> Attachments: SOLR-14503.patch, SOLR-14503.patch
>
>
> When starting Solr in cloud mode, if zookeeper is not available within 30 
> seconds, then core container initialization fails and the node will not 
> recover when zookeeper is available.
>  
> I believe SOLR-5129 should have addressed this issue, however it doesn't 
> quite do so for two reasons:
>  # 
> [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L297]
>  it calls {{SolrZkClient(String zkServerAddress, int zkClientTimeout)}} 
> rather than {{SolrZkClient(String zkServerAddress, int zkClientTimeout, int 
> zkClientConnectTimeout)}} so the DEFAULT_CLIENT_CONNECT_TIMEOUT of 30 seconds 
> is used even when you specify a different waitForZk value
>  # bin/solr contains script to set -DwaitForZk from the SOLR_WAIT_FOR_ZK 
> environment property 
> [https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2148] but 
> there is no corresponding assignment in bin/solr.cmd, while SOLR_WAIT_FOR_ZK 
> appears in the solr.in.cmd as an example.
>  
> I will attach a patch that fixes the above.






[jira] [Commented] (LUCENE-9539) Improve memory footprint of SortingCodecReader

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200795#comment-17200795
 ] 

ASF subversion and git services commented on LUCENE-9539:
-

Commit 427e11c7f644a05be93bb801ca394b90dccf8df6 in lucene-solr's branch 
refs/heads/branch_8x from Simon Willnauer
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=427e11c ]

LUCENE-9539: Remove caches from SortingCodecReader (#1909)

SortingCodecReader keeps all docvalues in memory that are loaded from this 
reader.
Yet, this reader should only be used for merging which happens sequentially. 
This makes
caching docvalues unnecessary.

Co-authored-by: Jim Ferenczi 

> Improve memory footprint of SortingCodecReader
> --
>
> Key: LUCENE-9539
> URL: https://issues.apache.org/jira/browse/LUCENE-9539
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> SortingCodecReader is very memory heavy since it needs to re-sort and load 
> large parts of the index into memory. We can try to make it more efficient by 
> using more compact internal data structures, and remove the caches it uses, 
> provided we define its usage as a merge-only reader wrapper. Ultimately we 
> need to find a way to allow the reader or some other structure to minimize 
> its heap memory. One way is to slice existing readers and merge them in 
> multiple steps. There will be multiple steps towards a more usable version 
> of this class.
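The merge-only rationale above can be illustrated with a small sketch (hypothetical names, not the real Lucene API): when the wrapped reader is consumed in one sequential pass, as a merge does, materializing and retaining every re-sorted value only holds heap that is never read twice.

```java
import java.util.function.IntToLongFunction;

public class MergeOnlySketch {
    // Old style: materialize all re-sorted values up front and keep them
    // for the lifetime of the reader (O(maxDoc) heap retained).
    static long[] cached(IntToLongFunction source, int[] sortMap, int maxDoc) {
        long[] cache = new long[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            cache[doc] = source.applyAsLong(sortMap[doc]);
        }
        return cache;
    }

    // Merge-only style: a single sequential pass; nothing is retained,
    // because merging visits each doc exactly once, in order.
    static long sumSequential(IntToLongFunction source, int[] sortMap, int maxDoc) {
        long sum = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            sum += source.applyAsLong(sortMap[doc]);
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] sortMap = {2, 0, 1};                 // new doc order -> old doc id
        IntToLongFunction vals = old -> old * 10L; // toy doc-values source
        long[] cache = cached(vals, sortMap, 3);
        System.out.println(cache[0] + " " + sumSequential(vals, sortMap, 3));
    }
}
```

Both paths yield the same values; only the cached variant pins them in memory after the pass is done.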






[GitHub] [lucene-solr] s1monw merged pull request #1909: LUCENE-9539: Remove caches from SortingCodecReader

2020-09-23 Thread GitBox


s1monw merged pull request #1909:
URL: https://github.com/apache/lucene-solr/pull/1909


   






[jira] [Commented] (LUCENE-9539) Improve memory footprint of SortingCodecReader

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200789#comment-17200789
 ] 

ASF subversion and git services commented on LUCENE-9539:
-

Commit 17c285d61743da0c06735e06235b20bd5aac4e14 in lucene-solr's branch 
refs/heads/master from Simon Willnauer
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=17c285d ]

LUCENE-9539: Remove caches from SortingCodecReader (#1909)

SortingCodecReader keeps all docvalues in memory that are loaded from this 
reader.
Yet, this reader should only be used for merging which happens sequentially. 
This makes
caching docvalues unnecessary.

Co-authored-by: Jim Ferenczi 

> Improve memory footprint of SortingCodecReader
> --
>
> Key: LUCENE-9539
> URL: https://issues.apache.org/jira/browse/LUCENE-9539
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> SortingCodecReader is very memory heavy since it needs to re-sort and load 
> large parts of the index into memory. We can try to make it more efficient by 
> using more compact internal data structures, and remove the caches it uses, 
> provided we define its usage as a merge-only reader wrapper. Ultimately we 
> need to find a way to allow the reader or some other structure to minimize 
> its heap memory. One way is to slice existing readers and merge them in 
> multiple steps. There will be multiple steps towards a more usable version 
> of this class.






[jira] [Updated] (SOLR-11167) bin/solr uses $SOLR_STOP_WAIT during start

2020-09-23 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-11167:
---
Fix Version/s: 8.7
   master (9.0)

> bin/solr uses $SOLR_STOP_WAIT during start
> --
>
> Key: SOLR-11167
> URL: https://issues.apache.org/jira/browse/SOLR-11167
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Reporter: Christine Poerschke
>Assignee: Christine Poerschke
>Priority: Minor
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-11167.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> bin/solr using $SOLR_STOP_WAIT during start is unexpected; I think it would 
> be clearer to have a separate $SOLR_START_WAIT variable.
> Related minor thing: SOLR_STOP_WAIT is mentioned in solr.in.sh but not in 
> the solr.in.cmd equivalent.






[jira] [Assigned] (SOLR-11167) bin/solr uses $SOLR_STOP_WAIT during start

2020-09-23 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke reassigned SOLR-11167:
--

Assignee: Christine Poerschke

> bin/solr uses $SOLR_STOP_WAIT during start
> --
>
> Key: SOLR-11167
> URL: https://issues.apache.org/jira/browse/SOLR-11167
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Reporter: Christine Poerschke
>Assignee: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-11167.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> bin/solr using $SOLR_STOP_WAIT during start is unexpected; I think it would 
> be clearer to have a separate $SOLR_START_WAIT variable.
> Related minor thing: SOLR_STOP_WAIT is mentioned in solr.in.sh but not in 
> the solr.in.cmd equivalent.






[jira] [Commented] (SOLR-11167) bin/solr uses $SOLR_STOP_WAIT during start

2020-09-23 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200786#comment-17200786
 ] 

Christine Poerschke commented on SOLR-11167:


Oops, a three-year-old ticket, not quite sure what happened here, apologies 
[~omar_abdelnabi]. Thanks for attaching a patch!

After all this time the patch unfortunately no longer applies to the current 
master branch. Hence I've replaced it with 
[https://github.com/apache/lucene-solr/pull/1913] instead, with two small 
differences:
 * {{solr.in.cmd}} changes are left out of scope, i.e. since it does not yet 
use $SOLR_STOP_WAIT it would be clearer to separately add $SOLR_START_WAIT 
and $SOLR_STOP_WAIT support for {{solr.cmd}}
 * instead of initialising with {{SOLR_START_WAIT=180}} (if no SOLR_START_WAIT 
was supplied), using {{SOLR_START_WAIT=$SOLR_STOP_WAIT}} will help ensure 
backwards compatibility for users that currently customise SOLR_STOP_WAIT, 
e.g. if anyone is currently setting {{SOLR_STOP_WAIT=42}} then they will 
continue to see 42s used for both stop and start even if they don't 
explicitly configure {{SOLR_START_WAIT=42}}
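The backwards-compatible defaulting described above can be sketched roughly as 
follows (a hypothetical fragment, not the actual bin/solr code; the 180s 
default and variable names are taken from this ticket):

```shell
# Hypothetical sketch of the defaulting: SOLR_STOP_WAIT keeps its existing
# 180s default, and SOLR_START_WAIT inherits SOLR_STOP_WAIT unless the user
# sets it explicitly, so a customised stop wait still applies to start.
SOLR_STOP_WAIT="${SOLR_STOP_WAIT:-180}"
SOLR_START_WAIT="${SOLR_START_WAIT:-$SOLR_STOP_WAIT}"
echo "start=${SOLR_START_WAIT}s stop=${SOLR_STOP_WAIT}s"
```

With {{SOLR_STOP_WAIT=42}} exported and no SOLR_START_WAIT, both waits come 
out as 42s, which is the compatibility behaviour argued for above.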

> bin/solr uses $SOLR_STOP_WAIT during start
> --
>
> Key: SOLR-11167
> URL: https://issues.apache.org/jira/browse/SOLR-11167
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-11167.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> bin/solr using $SOLR_STOP_WAIT during start is unexpected, I think it would 
> be clearer to have a separate $SOLR_START_WAIT variable.
> related minor thing: SOLR_STOP_WAIT is mentioned in solr.in.sh but not in 
> solr.in.cmd equivalent.






[jira] [Commented] (SOLR-14890) Refactor code to use annotations for configset API

2020-09-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200783#comment-17200783
 ] 

ASF subversion and git services commented on SOLR-14890:


Commit fd0c08615df9440061e5ae664dcfa3f5a7600568 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd0c086 ]

SOLR-14890: Refactor code to use annotations for configset API (#1911)



> Refactor code to use annotations for configset API
> --
>
> Key: SOLR-14890
> URL: https://issues.apache.org/jira/browse/SOLR-14890
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>







[GitHub] [lucene-solr] cpoerschke opened a new pull request #1913: SOLR-11167: Avoid $SOLR_STOP_WAIT use during 'bin/solr start' if $SOLR_START_WAIT is supplied.

2020-09-23 Thread GitBox


cpoerschke opened a new pull request #1913:
URL: https://github.com/apache/lucene-solr/pull/1913


   https://issues.apache.org/jira/browse/SOLR-11167



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [lucene-solr] noblepaul merged pull request #1911: SOLR-14890: Refactor code to use annotations for configset API

2020-09-23 Thread GitBox


noblepaul merged pull request #1911:
URL: https://github.com/apache/lucene-solr/pull/1911


   






[jira] [Updated] (SOLR-14890) Refactor code to use annotations for configset API

2020-09-23 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14890:
--
Summary: Refactor code to use annotations for configset API  (was: Refactor 
code to use annotations for cluster API)

> Refactor code to use annotations for configset API
> --
>
> Key: SOLR-14890
> URL: https://issues.apache.org/jira/browse/SOLR-14890
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>







[GitHub] [lucene-solr] jpountz commented on a change in pull request #1909: LUCENE-9539: Remove caches from SortingCodecReader

2020-09-23 Thread GitBox


jpountz commented on a change in pull request #1909:
URL: https://github.com/apache/lucene-solr/pull/1909#discussion_r493490751



##
File path: lucene/core/src/java/org/apache/lucene/index/SortingCodecReader.java
##
@@ -510,4 +457,52 @@ public LeafMetaData getMetaData() {
 return metaData;
   }
 
+  // we try to cache the last used DV or Norms instance since during merge
+  // this instance is used more than once. We could in addition to this single instance
+  // also cache the fields that are used for sorting since we do the work twice for these fields
+  private String cachedField;
+  private Object cachedObject;
+  private boolean cacheIsNorms;
+
+  private <T> T getOrCreateNorms(String field, IOSupplier<T> supplier) throws IOException {
+    return getOrCreate(field, true, supplier);
+  }
+
+  @SuppressWarnings("unchecked")
+  private synchronized <T> T getOrCreate(String field, boolean norms, IOSupplier<T> supplier) throws IOException {
+    if ((field.equals(cachedField) && cacheIsNorms == norms) == false) {
+      assert assertCreatedOnlyOnce(field, norms);
+      cachedObject = supplier.get();
+      cachedField = field;
+      cacheIsNorms = norms;
+    }
+    assert cachedObject != null;
+    return (T) cachedObject;
+  }
+
+  private final Map<String, Integer> cacheStats = new HashMap<>(); // only with assertions enabled
+  private boolean assertCreatedOnlyOnce(String field, boolean norms) {
+    assert Thread.holdsLock(this);
+    // this is mainly there to make sure that if we change anything in the way we merge we realize it early
+    Integer timesCached = cacheStats.compute(field + "N:" + norms, (s, i) -> i == null ? 1 : i.intValue() + 1);
+    if (timesCached > 1) {
+      assert norms == false : "[" + field + "] norms must not be cached twice";

Review comment:
   Ah I had forgotten we were doing things this way. Then ignore my comment!








[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1863: SOLR-14701: GuessSchemaFields URP to replace AddSchemaFields URP in schemaless mode

2020-09-23 Thread GitBox


noblepaul edited a comment on pull request #1863:
URL: https://github.com/apache/lucene-solr/pull/1863#issuecomment-697194702


   >Strong words there "worse than useless", especially considering that this - 
to me - seems a strong improvement on the current schemaless mode as it looks 
at more values and actually supports single/multivalued fields.
   
   I'm sorry for the confusion.
   
   I was referring to the current solution we have in Solr (the schemaless, 
guess-schema thing). It's not a comment on the new solution. The current 
schemaless mode is indeed worse than useless.
   
   >Generating Schema JSON raises its own questions, such as the shape of the 
schema it will be applied to, as guessing is currently happening as a 
differential to the existing schema. 
   
   The command is only relevant at that moment. If you execute it right away, 
it's useful. Users will most likely just copy-paste the command (and edit it, 
if required).





