[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1724: SOLR-14684: CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-10 Thread GitBox


dsmiley commented on a change in pull request #1724:
URL: https://github.com/apache/lucene-solr/pull/1724#discussion_r468335971



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java
##
@@ -155,6 +159,7 @@ public ServerIterator(Req req, Map 
zombieServers) {
   this.req = req;
   this.zombieServers = zombieServers;
   this.timeAllowedNano = getTimeAllowedInNanos(req.getRequest());
+  log.info("TimeAllowedNano:{}", this.timeAllowedNano);

Review comment:
   Are you sure we should log at info level here?  This seems more like a 
debug situation.
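The point being made above — per-request diagnostics belong at debug, not info, level — can be sketched with a small self-contained example. This uses java.util.logging purely for illustration (Solr itself uses SLF4J), and all names here are hypothetical: a FINE-level message is suppressed under a typical INFO-level configuration, so hot request paths don't flood the logs.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogLevelDemo {
    private static final Logger log = Logger.getLogger(LogLevelDemo.class.getName());

    static String recordTimeAllowed(long timeAllowedNano) {
        // Per-request diagnostics belong at a debug-like level (FINE here);
        // INFO would emit one line for every request served.
        log.log(Level.FINE, "TimeAllowedNano:{0}", timeAllowedNano);
        return log.isLoggable(Level.FINE) ? "logged" : "suppressed";
    }

    public static void main(String[] args) {
        log.setLevel(Level.INFO); // typical production default
        System.out.println(recordTimeAllowed(123L)); // prints "suppressed"

        log.setLevel(Level.FINE); // enable debug diagnostics only when needed
        System.out.println(recordTimeAllowed(123L)); // prints "logged"
    }
}
```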





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14680) Provide simple interfaces to our concrete SolrCloud classes

2020-08-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175239#comment-17175239
 ] 

ASF subversion and git services commented on SOLR-14680:


Commit 15ae014c598c0c02926ca3d7039f6389488e981e in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=15ae014 ]

SOLR-14680: Provide simple interfaces to our cloud classes  (only API) (#1694)



> Provide simple interfaces to our concrete SolrCloud classes
> ---
>
> Key: SOLR-14680
> URL: https://issues.apache.org/jira/browse/SOLR-14680
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> All our current implementations of SolrCloud such as 
> # ClusterState
> # DocCollection
> # Slice
> # Replica
> etc are concrete classes. Providing alternate implementations or wrappers is 
> extremely difficult. 
> SOLR-14613 is attempting to create such interfaces to make their SDK simpler.
> The objective is not to have a comprehensive set of methods in these 
> interfaces. We will start out with a subset of required interfaces. We 
> guarantee that signatures of methods in these interfaces will not be 
> deleted/changed, but we may add more methods as and when it suits us.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[GitHub] [lucene-solr] noblepaul merged pull request #1694: SOLR-14680: Provide simple interfaces to our cloud classes (only API)

2020-08-10 Thread GitBox


noblepaul merged pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694


   






[GitHub] [lucene-solr] dsmiley opened a new pull request #1735: LUCENE spell: Implement SuggestWord.toString

2020-08-10 Thread GitBox


dsmiley opened a new pull request #1735:
URL: https://github.com/apache/lucene-solr/pull/1735


   This is simply an obvious toString impl on SuggestWord.
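Lucene's SuggestWord exposes public fields `string`, `freq`, and `score`; the "obvious" toString the PR describes would simply render those. Below is a stand-in sketch, not the actual patch — the class name and output format here are illustrative only.

```java
// Minimal stand-in for Lucene's SuggestWord (public fields string, freq, score),
// showing the kind of obvious toString the PR describes. This is a sketch,
// not the code from PR #1735.
public class SuggestWordSketch {
    public String string; // the suggested word
    public int freq;      // document frequency of the suggestion
    public float score;   // similarity score of the suggestion

    @Override
    public String toString() {
        return "SuggestWord[string=" + string + ",freq=" + freq + ",score=" + score + "]";
    }

    public static void main(String[] args) {
        SuggestWordSketch w = new SuggestWordSketch();
        w.string = "lucene";
        w.freq = 42;
        w.score = 0.9f;
        System.out.println(w); // prints SuggestWord[string=lucene,freq=42,score=0.9]
    }
}
```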






[GitHub] [lucene-solr] madrob commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation

2020-08-10 Thread GitBox


madrob commented on a change in pull request #1726:
URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468315778



##
File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java
##
@@ -67,8 +69,21 @@ public boolean shouldExit() {
   }
 
   /**
-   * Method to set the time at which the timeOut should happen.
-   * @param timeAllowed set the time at which this thread should timeout.
+   * Sets or clears the time allowed based on how much time remains from the 
start of the request plus the configured
+   * {@link CommonParams#TIME_ALLOWED}.
+   */
+  public static void set(SolrQueryRequest req) {
+long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L);
+if (timeAllowed >= 0L) {
+  set(timeAllowed - (long)req.getRequestTimer().getTime()); // reduce by 
time already spent
+} else {
+  reset();
+}
+  }
+
+  /**
+   * Sets the time allowed (milliseconds), assuming we start a timer 
immediately.
+   * You should probably invoke {@link #set(SolrQueryRequest)} instead.
*/
   public static void set(Long timeAllowed) {

Review comment:
   should this be a primitive instead of a boxed type?

##
File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java
##
@@ -67,8 +69,21 @@ public boolean shouldExit() {
   }
 
   /**
-   * Method to set the time at which the timeOut should happen.
-   * @param timeAllowed set the time at which this thread should timeout.
+   * Sets or clears the time allowed based on how much time remains from the 
start of the request plus the configured
+   * {@link CommonParams#TIME_ALLOWED}.
+   */
+  public static void set(SolrQueryRequest req) {
+long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L);
+if (timeAllowed >= 0L) {

Review comment:
   Should be `>`, not `>=`. The docs on time allowed state that zero means no 
timeout, not an immediate timeout. Looks like we were previously inconsistent 
about this.
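The distinction the reviewer is drawing can be shown in a small sketch: a `TIME_ALLOWED` of zero (or any non-positive value) means "no timeout", so only strictly positive values should arm the timer. The method and class names below are illustrative, not Solr's actual code.

```java
// Sketch of the guard the reviewer asks for: '>' rather than '>=',
// because zero means "unlimited", not "time out immediately".
public class TimeAllowedGuard {
    static String interpret(long timeAllowedMs, long alreadySpentMs) {
        if (timeAllowedMs > 0L) { // '>' not '>=': zero means no timeout
            long remaining = timeAllowedMs - alreadySpentMs;
            return "timeout armed, remaining=" + remaining + "ms";
        }
        return "no timeout";
    }

    public static void main(String[] args) {
        System.out.println(interpret(0L, 5L));    // prints "no timeout"
        System.out.println(interpret(100L, 30L)); // prints "timeout armed, remaining=70ms"
    }
}
```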








[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-10 Thread GitBox


noblepaul commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r468260502



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/plugins/SamplePluginMinimizeCores.java
##
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement.plugins;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.Set;
+import java.util.Map;
+
+import com.google.common.collect.Ordering;
+import com.google.common.collect.TreeMultimap;
+import org.apache.solr.cluster.placement.Cluster;
+import org.apache.solr.cluster.placement.CoresCountPropertyValue;
+import org.apache.solr.cluster.placement.CreateNewCollectionPlacementRequest;
+import org.apache.solr.cluster.placement.Node;
+import org.apache.solr.cluster.placement.PlacementException;
+import org.apache.solr.cluster.placement.PlacementPlugin;
+import org.apache.solr.cluster.placement.PropertyKey;
+import org.apache.solr.cluster.placement.PropertyKeyFactory;
+import org.apache.solr.cluster.placement.PropertyValue;
+import org.apache.solr.cluster.placement.PropertyValueFetcher;
+import org.apache.solr.cluster.placement.Replica;
+import org.apache.solr.cluster.placement.ReplicaPlacement;
+import org.apache.solr.cluster.placement.PlacementRequest;
+import org.apache.solr.cluster.placement.PlacementPlan;
+import org.apache.solr.cluster.placement.PlacementPlanFactory;
+import org.apache.solr.common.util.SuppressForbidden;
+
+/**
+ * Implements placing replicas to minimize number of cores per {@link Node}, 
while not placing two replicas of the same
+ * shard on the same node.
+ *
+ * TODO: code not tested and never run, there are no implementation yet for 
used interfaces
+ */
+public class SamplePluginMinimizeCores implements PlacementPlugin {
+
+  @SuppressForbidden(reason = "Ordering.arbitrary() has no equivalent in 
Comparator class. Rather reuse than copy.")
+  public PlacementPlan computePlacement(Cluster cluster, PlacementRequest 
placementRequest, PropertyKeyFactory propertyFactory,
+PropertyValueFetcher propertyFetcher, 
PlacementPlanFactory placementPlanFactory) throws PlacementException {
+// This plugin only supports Creating a collection.
+if (!(placementRequest instanceof CreateNewCollectionPlacementRequest)) {
+  throw new PlacementException("This toy plugin only supports creating 
collections");
+}
+
+final CreateNewCollectionPlacementRequest reqCreateCollection = 
(CreateNewCollectionPlacementRequest) placementRequest;
+
+final int totalReplicasPerShard = 
reqCreateCollection.getNrtReplicationFactor() +
+reqCreateCollection.getTlogReplicationFactor() + 
reqCreateCollection.getPullReplicationFactor();
+
+if (cluster.getLiveNodes().size() < totalReplicasPerShard) {
+  throw new PlacementException("Cluster size too small for number of 
replicas per shard");
+}
+
+// Get number of cores on each Node
+TreeMultimap nodesByCores = 
TreeMultimap.create(Comparator.naturalOrder(), Ordering.arbitrary());

Review comment:
   I believe the property fetching is overly complicated. We should probably 
make it a lot simpler.
   
   Basically, the only requirement is strong typing. 
   
   `TreeMultimap nodesByCores = 
TreeMultimap.create(Comparator.naturalOrder(), Ordering.arbitrary());`
   
   This definitely is not the easiest code we could write. A user just wants to 
get an integer value for the # of cores in a node. 
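One way to read this ask — "a user just wants an integer core count per node" — is an accessor-style API, where picking the least-loaded nodes becomes a one-liner with no multimap plumbing. Everything below is hypothetical; none of these names come from the PR's interfaces.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified shape of what the reviewer suggests: if Node exposed
// a strongly typed core count directly, node selection needs no TreeMultimap.
public class MinimizeCoresSketch {
    interface Node {
        String name();
        int coreCount();
    }

    record SimpleNode(String name, int coreCount) implements Node {}

    // Return the 'wanted' nodes with the fewest cores.
    static List<Node> leastLoaded(List<Node> liveNodes, int wanted) {
        return liveNodes.stream()
                .sorted(Comparator.comparingInt(Node::coreCount))
                .limit(wanted)
                .toList();
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(new SimpleNode("n1", 7),
                                   new SimpleNode("n2", 2),
                                   new SimpleNode("n3", 5));
        // prints n2 then n3
        leastLoaded(nodes, 2).forEach(n -> System.out.println(n.name()));
    }
}
```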
   
   

##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/plugins/SamplePluginMinimizeCores.java
##
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the 

[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-10 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175124#comment-17175124
 ] 

Erick Erickson commented on SOLR-13412:
---

Given the discussion on the dev list, I'm going to close this as "won't fix" 
absent objections. I think the most cogent comment is that we should enhance 
the Luke Request Handler if there's a need rather than try to awkwardly package 
a windowing app with a server distro.

The origin of this Jira was "Hey! Luke has been integrated with Lucene, cool! 
Let's make it available from Solr". But as the discussion has continued, it 
seems like a poorer idea than it did at the start.

 

> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI, Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
>  * 
>  
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all






[jira] [Comment Edited] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-10 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169538#comment-17169538
 ] 

Mark Robert Miller edited comment on SOLR-14636 at 8/11/20, 12:36 AM:
--

I ended up having some other responsibilities last week, so this milestone has 
been pushed out a week.


was (Author: markrmiller):
There is not enough fun that goes on in development anymore. Robert and I used 
to have that nailed.

 !solr-ref-branch.gif!

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: IMG_5575 (1).jpg, jenkins.png, solr-ref-branch.gif
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *speed*: ludicrous
> *tests***:
>  * *core*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *solrj*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *test-framework*: *extremely stable* with {color:#de350b}*ignores*{color}
>  * *contrib/analysis-extras*: *extremely stable* with 
> {color:#de350b}*ignores*{color}
>  * *contrib/analytics*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/clustering*: {color:#00875a}*extremely stable*{color} with 
> *{color:#de350b}ignores{color}*
>  * *contrib/dataimporthandler*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/dataimporthandler-extras*: {color:#00875a}*extremely 
> stable*{color} with *{color:#de350b}ignores{color}*
>  * *contrib/extraction*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/jaegertracer-configurator*: {color:#00875a}*extremely 
> stable*{color} with {color:#de350b}*ignores*{color}
>  * *contrib/langid*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
>  * *contrib/prometheus-exporter*: {color:#00875a}*extremely stable*{color} 
> with {color:#de350b}*ignores*{color}
>  * *contrib/velocity*: {color:#00875a}*extremely stable*{color} with 
> {color:#de350b}*ignores*{color}
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-10 Thread GitBox


noblepaul edited a comment on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150


   >what is the target use case of the interface and lazy implementation? 
   
   The objectives are many
   
   -  Totally refactor Solr code base to minimize dependencies on concrete 
classes. This enables us to do simulation and testing, make code more readable, 
and enable refactoring
   - As we move to  a new mode for Solr with a lean core and packages/plugins, 
we want to have less API surface area against which the plugins are written. 
This enables the plugins to work against a wider range of versions without 
rewriting/recompiling
   - The `LazySolrCluster` will be the default impl for these interfaces, 
because this is (mostly) the current behaviour. We expect fresh data to be 
available at all times
   
   The problem with the existing classes implementing the interfaces is that, 
users of the APIs will cast these objects to the underlying concrete classes, 
which defeats the purpose







[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-10 Thread GitBox


noblepaul edited a comment on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150


   >what is the target use case of the interface and lazy implementation? 
   
   The objectives are many
   
   - The `LazySolrCluster` will be the default impl for these interfaces, 
because this is (mostly) the current behaviour. We expect fresh data to be 
available at all times
   -  Totally refactor Solr code base to minimize dependencies on concrete 
classes. This enables us to do simulation and testing, make code more readable, 
and enable refactoring
   - As we move to  a new mode for Solr with a lean core and packages/plugins, 
we want to have less API surface area against which the plugins are written. 
This enables the plugins to work against a wider range of versions without 
rewriting/recompiling
   
   The problem with the existing classes implementing the interfaces is that, 
users of the APIs will cast these objects to the underlying concrete classes, 
which defeats the purpose







[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-10 Thread GitBox


noblepaul edited a comment on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150


   >what is the target use case of the interface and lazy implementation? 
   
   The objectives are many
   
   -  Totally refactor Solr code base to minimize dependencies on concrete 
classes. This enables us to do simulation and testing, make code more readable, 
and enable refactoring
   - As we move to  a new mode for Solr with a lean core and packages/plugins, 
we want to have less API surface area against which the plugins are written. 
This enables the plugins to work against a wider range of versions without 
rewriting/recompiling
   
   The problem with the existing classes implementing the interfaces is that, 
users of the APIs will cast these objects to the underlying concrete classes, 
which defeats the purpose







[GitHub] [lucene-solr] noblepaul commented on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-10 Thread GitBox


noblepaul commented on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671651150


   >what is the target use case of the interface and lazy implementation? 
   
   The objectives are many
   
   -  Totally refactor Solr code base to minimize dependencies on concrete 
classes. This enables us to do simulation and testing, make code more readable, 
and enable refactoring
   - As we move to  a new mode for Solr with a lean core and packages/plugins, 
we want to have less API surface area against which the plugins are written. 
This enables the plugins to work against a wider range of versions without 
rewriting/recompiling
   
   The problem with the existing classes implementing the interfaces is that, 
users of the APIs will cast these objects to the underlying concrete classes, 
which defeats the purpose







[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175118#comment-17175118
 ] 

Cao Manh Dat commented on SOLR-14354:
-

[~rishisankar] sure, if you can also do the benchmark that Ishan asked for, it 
will be even better :D

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite an inefficient use of threads. Basically we 
> want fewer threads, and most of them must be busy all the time, because 
> threads are not free and neither is context switching. That is the main idea 
> behind constructs like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything is finished it will call {{onComplete}} listeners. One main 
> thing to notice here is that all listeners should finish quickly; if a 
> listener blocks, no further data for that request will be handled until the 
> listener finishes.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it gets used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-lived task) feeding content to the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take a long 
> time since sending a req is a very quick operation. With this approach, 
> handling threads won’t be spun up until the first bytes are sent back. Notice 
> that we still have active threads waiting for more data from the InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty has another listener called BufferingResponseListener. This is how it 
> gets used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response 

[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-10 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175099#comment-17175099
 ] 

Tomoko Uchida commented on SOLR-13412:
--

FWIW, an Elasticsearch user notified us that he/she created a Dockerized 
version of Luke. We could revisit this.

[https://github.com/DmitryKey/luke/issues/162]

 

> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI, Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
>  * 
>  
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all






[GitHub] [lucene-solr] madrob opened a new pull request #1734: LUCENE-9453 Add sync around volatile write

2020-08-10 Thread GitBox


madrob opened a new pull request #1734:
URL: https://github.com/apache/lucene-solr/pull/1734


   checkoutAndBlock is not synchronized, but has a non-atomic write to 
numPending. Meanwhile, all of the other writes to numPending are in sync 
methods.
   
   In this case it turns out to be ok because all of the code paths calling 
this method are already sync:
   
   `synchronized doAfterDocument -> checkout -> checkoutAndBlock`
   `checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> 
checkoutAndBlock`
   
   Making checkoutAndBlock synchronized protects us against future changes; it 
shouldn't cause any performance impact since the code paths will already be 
going through a sync block, and it will make an IntelliJ warning go away.
   
   Found via IntelliJ warnings. 
   
   https://issues.apache.org/jira/browse/LUCENE-9453
   
   






[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes

2020-08-10 Thread GitBox


madrob commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468227099



##
File path: 
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##
@@ -324,12 +324,12 @@ synchronized void doOnAbort(DocumentsWriterPerThread 
perThread) {
 }
   }
 
-  private void checkoutAndBlock(DocumentsWriterPerThread perThread) {
+  private synchronized void checkoutAndBlock(DocumentsWriterPerThread 
perThread) {

Review comment:
   https://issues.apache.org/jira/browse/LUCENE-9453 I explain in that 
issue why I believe it is minor, but it will help to get more eyes on it








[jira] [Created] (LUCENE-9453) DocumentWriterFlushControl missing explicit sync on write

2020-08-10 Thread Mike Drob (Jira)
Mike Drob created LUCENE-9453:
-

 Summary: DocumentWriterFlushControl missing explicit sync on write
 Key: LUCENE-9453
 URL: https://issues.apache.org/jira/browse/LUCENE-9453
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Mike Drob


checkoutAndBlock is not synchronized, but has a non-atomic write to 
{{numPending}}. Meanwhile, all of the other writes to numPending are in sync 
methods.

In this case it turns out to be ok because all of the code paths calling this 
method are already sync:

{{synchronized doAfterDocument -> checkout -> checkoutAndBlock}}
{{checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> 
checkoutAndBlock}}

If we make {{checkoutAndBlock}} synchronized, that protects us against future 
changes; it shouldn't cause any performance impact, since these code paths 
already go through a sync block, and it will make an IntelliJ warning go away.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes

2020-08-10 Thread GitBox


madrob commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468221630



##
File path: lucene/core/src/java/org/apache/lucene/index/DocValuesUpdate.java
##
@@ -152,12 +152,12 @@ static BytesRef readFrom(DataInput in, BytesRef scratch) 
throws IOException {
 }
 
 NumericDocValuesUpdate(Term term, String field, Long value) {
-  this(term, field, value != null ? value.longValue() : -1, 
BufferedUpdates.MAX_INT, value != null);
+  this(term, field, value != null ? value : -1, BufferedUpdates.MAX_INT, 
value != null);
 }
 
 
-private NumericDocValuesUpdate(Term term, String field, long value, int 
docIDUpTo, boolean hasValue) {

Review comment:
   There were 16 instances of `Upto` and 4 of `UpTo`, so I went with the more 
common one for consistency. Happy to switch the other way if it's more correct 
English. Looking it up now, and it looks like "upto" isn't a word?

##
File path: 
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##
@@ -324,12 +324,12 @@ synchronized void doOnAbort(DocumentsWriterPerThread 
perThread) {
 }
   }
 
-  private void checkoutAndBlock(DocumentsWriterPerThread perThread) {
+  private synchronized void checkoutAndBlock(DocumentsWriterPerThread 
perThread) {

Review comment:
   I'll split this out.








[jira] [Commented] (SOLR-14687) Make child/parent query parsers natively aware of _nest_path_

2020-08-10 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175086#comment-17175086
 ] 

Chris M. Hostetter commented on SOLR-14687:
---

besides the fact that Jira's WYSIWYG editor lied to me and munged up some of 
the formatting of "STAR:STAR" and "UNDERSCORE nest UNDERSCORE path UNDERSCORE" 
in many places, something else has been nagging at me that i felt like i was 
overlooking, and i finally figured out what it is: I hadn't really accounted for 
docs that _have_ a "nest path" but whose path doesn't have any common ancestors 
with the {{parentPath}} specified – ie: how would a mix of {{/a/b/c}} hierarchy 
docs in an index with docs having a hierarchy of {{/x/y/z}} wind up 
affecting each other?

I *think* that what i described above would still mostly work for the "parent" 
parser – even if the "parent filter" generated by a {{parentPath="/a/b/c"}} as 
i described above didn't really "rule out" the other docs, because they still 
wouldn't match the "nest path with a prefix of /a/b/c" rule for the "children" 
– but it still wouldn't really be a "correct" "parents bit set filter" as the 
underlying code expects it to be, in terms of identifying all "non children" 
documents. But I'm _pretty sure_ it would be broken for the "child" parser 
case, because some doc with an "/x" or "/x/y" path isn't going to be matched 
by the "parents filter bitset", so it might get swallowed up in the list of 
children.

The other thing that bugged me was the (mistaken & misguided) need to ' ... 
compute a list of all "prefix subpaths" ... ' – i'm not sure why i thought that 
was necessary, instead of just saying "must _NOT_ have a prefix of the 
specified path" – ie:
{code:java}
 GIVEN:{!foo parentPath="/a/b/c"} ...

INSTEAD OF:PARENT FILTER BITSET = ((*:* -_nest_path_:*) OR _nest_path_:(/a 
/a/b /a/b/c))

  JUST USE:PARENT FILTER BITSET = (*:* -{prefix f="_nest_path_" 
v="/a/b/c/"}) {code}
...which (IIUC) should solve both problems, by matching:
 * docs w/o any nest path
 * docs with a nest path that does NOT start with /a/b/c/
 ** which includes the immediate "/a/b/c" parents, as well as their ancestors, 
as well as any docs with completely orthogonal paths (like /x/y/z)

But of course: in the case of {{parentPath="/"}} this would still simply be 
"docs w/o a nest path"

That should work, right?
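The membership rule above can be sketched in plain Java (a hypothetical helper 
for illustration only, not the actual parser code; assumes nest paths are 
stored without a trailing slash):

```java
public class ParentBitsetRule {
    // A doc belongs to the "parents filter bitset" for a given parentPath iff
    // it has no nest path at all, OR its nest path does NOT start with
    // parentPath + "/". parentPath "/" is the degenerate case: only docs
    // without any nest path qualify.
    static boolean inParentBitset(String nestPath, String parentPath) {
        if (nestPath == null) return true;         // doc w/o any nest path
        if (parentPath.equals("/")) return false;  // root: every nested doc excluded
        return !nestPath.startsWith(parentPath + "/");
    }

    public static void main(String[] args) {
        // immediate parents, their ancestors, and orthogonal hierarchies all
        // stay in the bitset; only true descendants of /a/b/c are excluded
        System.out.println(inParentBitset("/a/b/c", "/a/b/c"));   // true
        System.out.println(inParentBitset("/a", "/a/b/c"));       // true
        System.out.println(inParentBitset("/x/y/z", "/a/b/c"));   // true
        System.out.println(inParentBitset("/a/b/c/d", "/a/b/c")); // false
        System.out.println(inParentBitset(null, "/"));            // true
        System.out.println(inParentBitset("/x", "/"));            // false
    }
}
```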

I also think i made some mistakes/typos in my examples above in trying to 
articulate what the equivalent "old style" query would be, so let me restate all 
of the examples in full...
{noformat}
NEW:  q={!parent parentPath="/a/b/c"}c_title:son

OLD:  q=(+{!field f="_nest_path_" v="/a/b/c"} +{!parent which=$ff v=$vv})
 ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) 
 vv=(+c_title:son +{prefix f="_nest_path_" v="/a/b/c/"})
{noformat}
{noformat}
NEW:  q={!parent parentPath="/"}c_title:son

OLD:  q=(-_nest_path_:* +{!parent which=$ff v=$vv}
 ff=(*:* -_nest_path_:*) 
 vv=(+c_title:son +_nest_path_:*)
{noformat}
{noformat}
NEW:  q={!child parentPath="/a/b/c"}p_title:dad

OLD:  q={!child of=$ff v=$vv})
 ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) 
 vv=(+p_title:dad +{field f="_nest_path_" v="/a/b/c"})
{noformat}
{noformat}
NEW:  q={!child parentPath="/"}p_title:dad

OLD:  q={!child of=$ff v=$vv})
 ff=(*:* -_nest_path_:*) 
 vv=(+p_title:dad +_nest_path_:*)
{noformat}
 

[~mkhl] - what do you think about this approach? do you see any flaws in the 
logic here? ... if the logic looks correct, I'd like to write it up as "how to 
create a *safe* of/which local param when using nest path" doc tip for 
SOLR-14383 and move forward there as a documentation improvement, even if there 
are still feature/implementation/syntax concerns/discussion to happen here as 
far as a "new feature"

 

> Make child/parent query parsers natively aware of _nest_path_
> -
>
> Key: SOLR-14687
> URL: https://issues.apache.org/jira/browse/SOLR-14687
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Chris M. Hostetter
>Priority: Major
>
> A long standing pain point of the parent/child QParsers is the "all parents" 
> bitmask/filter specified via the "which" and "of" params (respectively).
> This is particularly tricky/painful to "get right" when dealing with 
> multi-level nested documents...
>  * 
> https://issues.apache.org/jira/browse/SOLR-14383?focusedCommentId=17166339=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17166339
>  * 
> [https://lists.apache.org/thread.html/r7633a366dd76e7ce9d98e6b9f2a65da8af8240e846f789d938c8113f%40%3Csolr-user.lucene.apache.org%3E]
> ...and it's *really* hard to get right when the nested structure isn't 100% 
> consistent among all docs:
>  * collections that mix docs w/o children and docs that have 

[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes

2020-08-10 Thread GitBox


madrob commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468220684



##
File path: 
lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsWriter.java
##
@@ -709,7 +709,7 @@ private PendingBlock writeBlock(int prefixLength, boolean 
isFloor, int floorLead
 
   PendingTerm term = (PendingTerm) ent;
 
-  assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" 
+ term.termBytes + " prefix=" + prefix;
+  assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" 
+ new String(term.termBytes) + " prefix=" + prefix;

Review comment:
   Are these UTF-8? I wasn't sure, and hoped somebody would let me know 
during review.
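   For what it's worth, `new String(byte[])` decodes with the platform default 
charset, so if these term bytes are UTF-8 (as Lucene's text-field terms are), 
an explicit charset is the safer rendering for the assert message. A small 
illustration (stand-alone demo, not the BlockTree code):

```java
import java.nio.charset.StandardCharsets;

public class TermBytesDemo {
    public static void main(String[] args) {
        byte[] termBytes = "héllo".getBytes(StandardCharsets.UTF_8);
        // new String(byte[]) uses the platform default charset and may mangle
        // non-ASCII bytes on JVMs whose default is not UTF-8:
        String maybeWrong = new String(termBytes);
        // explicit UTF-8 decoding is deterministic everywhere:
        String right = new String(termBytes, StandardCharsets.UTF_8);
        System.out.println(right); // prints héllo
    }
}
```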








[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes

2020-08-10 Thread GitBox


madrob commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468220535



##
File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java
##
@@ -94,7 +94,7 @@
* Create a new Analyzer, reusing the same set of components per-thread
* across calls to {@link #tokenStream(String, Reader)}. 
*/
-  public Analyzer() {

Review comment:
   I understand that it's notionally an API change, but `abstract` classes 
have no reason for public constructors. We can make everything protected, and 
the subclasses that people use will pick it up. I was over-zealous in a couple 
of places going to package-private instead of protected; I'll fix that up.








[jira] [Commented] (LUCENE-8776) Start offset going backwards has a legitimate purpose

2020-08-10 Thread Roman (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175081#comment-17175081
 ] 

Roman commented on LUCENE-8776:
---

I too suffer from the same issue; we have multi-token synonyms that can even 
overlap. I recognize the arguments against backward offsets, but I find them 
surprisingly backwards: they say that the implementation dictates function, 
when the function is (for many people) the goal. The arguments also seem to say 
that the most efficient implementation (non-negative integer deltas) does not 
allow backward offsets, therefore backward offsets are a bug.

Please recognize that the most elegant implementation sometimes means "as 
complex as needed" – it is not the same as "the simplest". If negative vints 
consume 5 bytes instead of 4, some people need to, and are willing to, pay that 
price. Their use cases cannot simply be 'boxed' into a world where one is 
only looking ahead and never back (NLP is one such world).
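The 4-vs-5-byte cost mentioned above follows from writing variable-length ints 
7 bits per byte; a quick sketch (an illustration of the LEB128-style scheme, 
not Lucene's actual writeVInt code):

```java
public class VIntCost {
    // number of bytes a 32-bit value takes when encoded 7 bits per byte
    static int vIntLength(int v) {
        int n = 1;
        while ((v & ~0x7F) != 0) { // more than 7 significant bits remaining?
            v >>>= 7;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(vIntLength(127)); // prints 1
        System.out.println(vIntLength(128)); // prints 2
        // a negative int sign-extends to 32 set high bits, so it always costs
        // the maximum 5 bytes -- the price a negative offset delta would pay
        System.out.println(vIntLength(-1));  // prints 5
    }
}
```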

Lucene is however inviting one particular solution:

The implementation of vint seems not to mind a negative offset 
(https://issues.apache.org/jira/browse/LUCENE-3738), and DefaultIndexingChain 
extends DocConsumer – the name 'Default' suggests that at some point in the 
past, Lucene developers wanted to provide other implementations. As it is 
*right now*, it is not easy to plug in a different 'DocConsumer' – that surely 
seems like an important omission! (one size fits all?).

So if we just add a simple mechanism to instruct Lucene which DocConsumer to 
use, then all could be happy and not have to resort to dirty hacks or forks. 
The most efficient impl will be the default, yet it will allow us - dirty 
bastards - to shoot ourselves in the foot if we so desire. SOLR as well as 
ElasticSearch devs might not mind having the option in the future - it can come 
in handy. Wouldn't that be wonderful? Well, wonderful certainly not, just 
useful... could I do it? [~rcmuir] [~mikemccand] [~simonw]

 

 

 

> Start offset going backwards has a legitimate purpose
> -
>
> Key: LUCENE-8776
> URL: https://issues.apache.org/jira/browse/LUCENE-8776
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 7.6
>Reporter: Ram Venkat
>Priority: Major
>
> Here is the use case where startOffset can go backwards:
> Say there is a line "Organic light-emitting-diode glows", and I want to run 
> span queries and highlight them properly. 
> During index time, light-emitting-diode is split into three words, which 
> allows me to search for 'light', 'emitting' and 'diode' individually. The 
> three words occupy adjacent positions in the index, as 'light' adjacent to 
> 'emitting' and 'light' at a distance of two words from 'diode' need to match 
> this word. So, the order of words after splitting are: Organic, light, 
> emitting, diode, glows. 
> But, I also want to search for 'organic' being adjacent to 
> 'light-emitting-diode' or 'light-emitting-diode' being adjacent to 'glows'. 
> The way I solved this was to also generate 'light-emitting-diode' at two 
> positions: (a) In the same position as 'light' and (b) in the same position 
> as 'glows', like below:
> ||organic||light||emitting||diode||glows||
> | |light-emitting-diode| |light-emitting-diode| |
> |0|1|2|3|4|
> The positions of the two 'light-emitting-diode' are 1 and 3, but the offsets 
> are obviously the same. This works beautifully in Lucene 5.x in both 
> searching and highlighting with span queries. 
> But when I try this in Lucene 7.6, it hits the condition "Offsets must not go 
> backwards" at DefaultIndexingChain:818. This IllegalArgumentException is 
> being thrown without any comments on why this check is needed. As I explained 
> above, startOffset going backwards is perfectly valid, to deal with word 
> splitting and span operations on these specialized use cases. On the other 
> hand, it is not clear what value is added by this check and which highlighter 
> code is affected by offsets going backwards. This same check is done at 
> BaseTokenStreamTestCase:245. 
> I see others talk about how this check found bugs in WordDelimiter etc. but 
> it also prevents legitimate use cases. Can this check be removed?  






[GitHub] [lucene-solr] gautamworah96 opened a new pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer

2020-08-10 Thread GitBox


gautamworah96 opened a new pull request #1733:
URL: https://github.com/apache/lucene-solr/pull/1733


   
   
   
   # Description
   
   This PR modifies the taxonomy writer and reader implementations to use 
BinaryDocValues instead of stored fields. 
   The taxonomy index uses stored fields today, and must do a number of 
stored-field lookups for each query to resolve taxonomy ordinals back to 
human-presentable facet labels.
   
   # Solution
   
   Change the storage format to use DocValues
   
   # Tests
   
   `ant test` fails because `.binaryValue()` throws a `NullPointerException`.
   
   To reproduce the error:
   `ant test  -Dtestcase=TestExpressionAggregationFacetsExample 
-Dtests.method=testSimple -Dtests.seed=4544BD51622879A4 -Dtests.slow=true 
-Dtests.badapples=true -Dtests.locale=si 
-Dtests.timezone=Antarctica/DumontDUrville -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII`
   
   gives
   
   ```
   [junit4:pickseed] Seed property 'tests.seed' already defined: 4544BD51622879A4
   [mkdir] Created dir: 
/Users/gauworah/opensource/mystuff/lucene-solr/lucene/build/demo/test/temp
  [junit4]  says Привет! Master seed: 4544BD51622879A4
  [junit4] Executing 1 suite with 1 JVM.
  [junit4] 
  [junit4] Started J0 PID(76859@localhost).
  [junit4] Suite: 
org.apache.lucene.demo.facet.TestExpressionAggregationFacetsExample
  [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestExpressionAggregationFacetsExample -Dtests.method=testSimple 
-Dtests.seed=4544BD51622879A4 -Dtests.slow=true -Dtests.badapples=true 
-Dtests.locale=si -Dtests.timezone=Antarctica/DumontDUrville 
-Dtests.asserts=true -Dtests.file.encoding=US-ASCII
  [junit4] ERROR   0.61s | 
TestExpressionAggregationFacetsExample.testSimple <<<
  [junit4]> Throwable #1: java.lang.NullPointerException
  [junit4]>at 
__randomizedtesting.SeedInfo.seed([4544BD51622879A4:7DF799AF45DBAD75]:0)
  [junit4]>at 
org.apache.lucene.index.MultiDocValues$3.binaryValue(MultiDocValues.java:403)
  [junit4]>at 
org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader.getPath(DirectoryTaxonomyReader.java:328)
  [junit4]>at 
org.apache.lucene.facet.taxonomy.FloatTaxonomyFacets.getTopChildren(FloatTaxonomyFacets.java:151)
  [junit4]>at 
org.apache.lucene.demo.facet.ExpressionAggregationFacetsExample.search(ExpressionAggregationFacetsExample.java:107)
  [junit4]>at 
org.apache.lucene.demo.facet.ExpressionAggregationFacetsExample.runSearch(ExpressionAggregationFacetsExample.java:118)
  [junit4]>at 
org.apache.lucene.demo.facet.TestExpressionAggregationFacetsExample.testSimple(TestExpressionAggregationFacetsExample.java:28)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:567)
  [junit4]>at java.base/java.lang.Thread.run(Thread.java:830)
   ```
   
   3 other tests also fail at the same line
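   One common cause of an NPE at `MultiDocValues...binaryValue` is reading an 
unpositioned iterator: since doc values became iterators (Lucene 7), 
`advanceExact(docId)` must succeed before `binaryValue()` is touched. This is 
not a confirmed diagnosis of the failure above, just a hedged sketch of the 
expected read pattern; the field name "label" is made up for illustration:

```java
import org.apache.lucene.document.BinaryDocValuesField;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.MultiDocValues;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

public class BinaryDocValuesReadPattern {
    static String readLabel() throws Exception {
        Directory dir = new ByteBuffersDirectory();
        try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig())) {
            Document doc = new Document();
            // "label" is an illustrative field name, not the taxonomy index's
            doc.add(new BinaryDocValuesField("label", new BytesRef("/author/Tolkien")));
            w.addDocument(doc);
        }
        try (DirectoryReader r = DirectoryReader.open(dir)) {
            BinaryDocValues vals = MultiDocValues.getBinaryValues(r, "label");
            // position the iterator BEFORE calling binaryValue()
            if (vals != null && vals.advanceExact(0)) {
                return vals.binaryValue().utf8ToString();
            }
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readLabel()); // prints /author/Tolkien
    }
}
```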
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   
   **This is a draft PR**






[jira] [Commented] (SOLR-13528) Rate limiting in Solr

2020-08-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175072#comment-17175072
 ] 

ASF subversion and git services commented on SOLR-13528:


Commit 424a9a6cfc64476b8d3fbee4f38733ffcb297f7c in lucene-solr's branch 
refs/heads/master from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=424a9a6 ]

SOLR-13528: fix heading levels


> Rate limiting in Solr
> -
>
> Key: SOLR-13528
> URL: https://issues.apache.org/jira/browse/SOLR-13528
> Project: Solr
>  Issue Type: New Feature
>Reporter: Anshum Gupta
>Assignee: Atri Sharma
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> In relation to SOLR-13527, Solr also needs a way to throttle update and 
> search requests based on usage metrics. This is the umbrella JIRA for both 
> update and search rate limiting.






[jira] [Comment Edited] (SOLR-13528) Rate limiting in Solr

2020-08-10 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175070#comment-17175070
 ] 

Cassandra Targett edited comment on SOLR-13528 at 8/10/20, 9:34 PM:


The Ref Guide docs in this commit were throwing some errors in the build (not 
failing it though) about inconsistent heading levels, which I only noticed 
because I was working on fixing our new Jenkins builds.

I'm about to make a commit to fix that, but I noticed that the headings in 
question are all parameters users can configure. That's not generally how we 
structure parameters, especially when there is not a lot of text for each one 
(we have a couple examples of this being done, but I will someday get around to 
changing those to be like the majority of the parameters throughout the Guide, 
like in 
https://lucene.apache.org/solr/guide/8_6/detecting-languages-during-indexing.html#langid-parameters).

I'm also uncomfortable with how the example parameters are shown (as separate 
source blocks). I think it might be simpler and more instructive for readers to 
restructure this a bit, to show a full configuration of the 
{{SolrRequestFilter}}, with all the parameters in a single block. Users can 
then copy/paste it more easily and will be less likely to miss a parameter. As 
it stands, I have to do a bit of mental gymnastics to figure out where this 
needs to go (full path to the file) and what it should look like.

I'm fine doing this myself, I just wanted to let you know what I am going to do 
and why. Of course, if you'd like to do it, I'll be happy to let you.


was (Author: ctargett):
The Ref Guide docs in this commit were throwing some errors in the build (not 
failing it though) about inconsistent heading levels, which I only noticed 
because I was working on fixing our new Jenkins builds.

I'm about to make a commit to fix that, but I noticed that the headings in 
question are all parameters users can configure. That's not generally how we 
structure parameters, especially when there is not a lot of text for each one 
(we have a couple examples of this being done, but I will someday get around to 
changing those to be like the majority of the parameters throughout the Guide, 
like in 
https://lucene.apache.org/solr/guide/8_6/detecting-languages-during-indexing.html#langid-parameters).

I'm also uncomfortable with how the example parameters are shown (as separate 
source blocks). I think it might be simpler and more instructive for readers to 
restructure this a bit, to show a full configuration of the 
{{SolrRequestFilter}}, with all the parameters in a single block. Users can 
then copy/paste it more easily and will be less likely to miss a parameter.

I'm fine doing this myself, I just wanted to let you know what I am going to do 
and why. Of course, if you'd like to do it, I'll be happy to let you.

> Rate limiting in Solr
> -
>
> Key: SOLR-13528
> URL: https://issues.apache.org/jira/browse/SOLR-13528
> Project: Solr
>  Issue Type: New Feature
>Reporter: Anshum Gupta
>Assignee: Atri Sharma
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> In relation to SOLR-13527, Solr also needs a way to throttle update and 
> search requests based on usage metrics. This is the umbrella JIRA for both 
> update and search rate limiting.






[jira] [Commented] (SOLR-13528) Rate limiting in Solr

2020-08-10 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175070#comment-17175070
 ] 

Cassandra Targett commented on SOLR-13528:
--

The Ref Guide docs in this commit were throwing some errors in the build (not 
failing it though) about inconsistent heading levels, which I only noticed 
because I was working on fixing our new Jenkins builds.

I'm about to make a commit to fix that, but I noticed that the headings in 
question are all parameters users can configure. That's not generally how we 
structure parameters, especially when there is not a lot of text for each one 
(we have a couple examples of this being done, but I will someday get around to 
changing those to be like the majority of the parameters throughout the Guide, 
like in 
https://lucene.apache.org/solr/guide/8_6/detecting-languages-during-indexing.html#langid-parameters).

I'm also uncomfortable with how the example parameters are shown (as separate 
source blocks). I think it might be simpler and more instructive for readers to 
restructure this a bit, to show a full configuration of the 
{{SolrRequestFilter}}, with all the parameters in a single block. Users can 
then copy/paste it more easily and will be less likely to miss a parameter.

I'm fine doing this myself, I just wanted to let you know what I am going to do 
and why. Of course, if you'd like to do it, I'll be happy to let you.

> Rate limiting in Solr
> -
>
> Key: SOLR-13528
> URL: https://issues.apache.org/jira/browse/SOLR-13528
> Project: Solr
>  Issue Type: New Feature
>Reporter: Anshum Gupta
>Assignee: Atri Sharma
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> In relation to SOLR-13527, Solr also needs a way to throttle update and 
> search requests based on usage metrics. This is the umbrella JIRA for both 
> update and search rate limiting.






[jira] [Resolved] (LUCENE-9452) Remove jenkins.build.ref.guide.sh

2020-08-10 Thread Cassandra Targett (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett resolved LUCENE-9452.
---
Fix Version/s: 8.7
   Resolution: Fixed

I only backported to branch_8x, but could also remove it from branch_8_6 if 
it's necessary to do so.

> Remove jenkins.build.ref.guide.sh
> -
>
> Key: LUCENE-9452
> URL: https://issues.apache.org/jira/browse/LUCENE-9452
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 8.7
>
>
> After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins 
> jobs stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} 
> script we used to build the Guide installed its own RVM and gemset for the 
> required gems to run with the Ant build and it was difficult to get the paths 
> right. Infra added the dependencies that we need to their Puppet-managed node 
> deploy process (see INFRA-20656) and now we don't need a script to do any of 
> that for us.
> This issue is to track removing the script since it's no longer required. The 
> Ref Guide build jobs will just invoke Ant directly instead.
> IIUC from SOLR-10568 when the script was added, there might still come a day 
> when there is a version mismatch between what was installed by default and 
> what our build needs, but I think it's fair to try to work with Infra to get 
> our needs met on the nodes instead of adding them to a script which makes 
> migration like this more complex.
> All of these pre-build dependencies go away, however, when we move to Gradle, 
> so even if we have a version mismatch one time it won't be a persistent issue.






[jira] [Commented] (LUCENE-9452) Remove jenkins.build.ref.guide.sh

2020-08-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175051#comment-17175051
 ] 

ASF subversion and git services commented on LUCENE-9452:
-

Commit c5c1f43c0effaa2647312cc6ce8e5704bead020f in lucene-solr's branch 
refs/heads/branch_8x from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c5c1f43 ]

LUCENE-9452: remove jenkins.build.ref.guide.sh as it's no longer needed


> Remove jenkins.build.ref.guide.sh
> -
>
> Key: LUCENE-9452
> URL: https://issues.apache.org/jira/browse/LUCENE-9452
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins 
> jobs stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} 
> script we used to build the Guide installed its own RVM and gemset for the 
> required gems to run with the Ant build and it was difficult to get the paths 
> right. Infra added the dependencies that we need to their Puppet-managed node 
> deploy process (see INFRA-20656) and now we don't need a script to do any of 
> that for us.
> This issue is to track removing the script since it's no longer required. The 
> Ref Guide build jobs will just invoke Ant directly instead.
> IIUC from SOLR-10568 when the script was added, there might still come a day 
> when there is a version mismatch between what was installed by default and 
> what our build needs, but I think it's fair to try to work with Infra to get 
> our needs met on the nodes instead of adding them to a script which makes 
> migration like this more complex.
> All of these pre-build dependencies go away, however, when we move to Gradle, 
> so even if we have a version mismatch one time it won't be a persistent issue.






[jira] [Commented] (LUCENE-9452) Remove jenkins.build.ref.guide.sh

2020-08-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175050#comment-17175050
 ] 

ASF subversion and git services commented on LUCENE-9452:
-

Commit a747051c6ae348a7f16cf684e4de6b49e27fc5c3 in lucene-solr's branch 
refs/heads/master from Cassandra Targett
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a747051 ]

LUCENE-9452: remove jenkins.build.ref.guide.sh as it's no longer needed


> Remove jenkins.build.ref.guide.sh
> -
>
> Key: LUCENE-9452
> URL: https://issues.apache.org/jira/browse/LUCENE-9452
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
>
> After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins 
> jobs stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} 
> script we used to build the Guide installed its own RVM and gemset for the 
> required gems to run with the Ant build and it was difficult to get the paths 
> right. Infra added the dependencies that we need to their Puppet-managed node 
> deploy process (see INFRA-20656) and now we don't need a script to do any of 
> that for us.
> This issue is to track removing the script since it's no longer required. The 
> Ref Guide build jobs will just invoke Ant directly instead.
> IIUC from SOLR-10568 when the script was added, there might still come a day 
> when there is a version mismatch between what was installed by default and 
> what our build needs, but I think it's fair to try to work with Infra to get 
> our needs met on the nodes instead of adding them to a script which makes 
> migration like this more complex.
> All of these pre-build dependencies go away, however, when we move to Gradle, 
> so even if we have a version mismatch one time it won't be a persistent issue.






[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Rishi Sankar (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175025#comment-17175025
 ] 

Rishi Sankar commented on SOLR-14354:
-

[~caomanhdat] I am interested in this work as well; if you'd like, I am happy 
to work on this and do a PR with the API changes David suggested.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request will submit n requests (n 
> equal to the number of shards) to an executor. So each request corresponds 
> to a thread; after sending its request, that thread basically does nothing 
> but wait for the response from the other side. That thread will be swapped 
> out and the CPU will try to handle another thread (this is called a context 
> switch: the CPU saves the context of the current thread and switches to 
> another one). When some data (not all) comes back, that thread is woken up 
> to parse the data, then it waits until more data comes back. So there will 
> be lots of context switching on the CPU, which is quite an inefficient use 
> of threads. Basically we want fewer threads, with most of them busy all the 
> time, because threads are not free and neither is context switching. That 
> is the main idea behind constructs like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path;)
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it 
> calls the {{onHeaders()}} listeners. When the client receives some 
> {{byte[]}} (not the whole response) it calls the {{onContent(buffer)}} 
> listeners. When everything has finished it calls the {{onComplete}} 
> listeners. One important thing to notice here is that all listeners should 
> finish quickly; if a listener blocks, no further data for that request will 
> be handled until the listener finishes.
> h2. 3. Solution 1: Sending requests async but spinning one thread per response
>  Jetty HttpClient already provides several listeners, one of which is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content to the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> By using this listener, the model of HttpShardHandler can be rewritten into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take a long 
> time since sending a request is a very quick operation. With this approach, 
> handling threads won’t be spun up until the first bytes are sent back. 
> Notice that in this approach we still have active threads waiting for more 
> data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s threads
> Jetty has another listener called BufferingResponseListener. This is how it 
> is used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result 
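The async fan-out model described above can be sketched with the JDK alone. The following is not Jetty's API but a hypothetical analogy using CompletableFuture, where sendToShard stands in for the real HTTP call: the submitting thread registers callbacks instead of parking one thread per in-flight response, and blocks only once for the merged result.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class AsyncFanOut {

  // Stand-in for the real HTTP call: completes on a worker thread
  // with a partial (per-shard) result.
  static CompletableFuture<Integer> sendToShard(int shard, ExecutorService pool) {
    return CompletableFuture.supplyAsync(() -> shard * 10, pool);
  }

  // Fan out to n shards; responses are merged via callbacks as they
  // arrive, and the caller blocks exactly once, for the combined result.
  static int search(int nShards) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<CompletableFuture<Integer>> pending = IntStream.range(0, nShards)
          .mapToObj(s -> sendToShard(s, pool))
          .collect(Collectors.toList());
      return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]))
          .thenApply(v -> pending.stream().mapToInt(CompletableFuture::join).sum())
          .join();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(search(3)); // 0 + 10 + 20 = 30
  }
}
```

The point of the sketch is the shape of the control flow, not the merging logic: no thread is dedicated to waiting on any single shard's response.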

[jira] [Created] (LUCENE-9452) Remove jenkins.build.ref.guide.sh

2020-08-10 Thread Cassandra Targett (Jira)
Cassandra Targett created LUCENE-9452:
-

 Summary: Remove jenkins.build.ref.guide.sh
 Key: LUCENE-9452
 URL: https://issues.apache.org/jira/browse/LUCENE-9452
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/build
Reporter: Cassandra Targett
Assignee: Cassandra Targett


After the move to Cloudbees (ci-builds.apache.org), the Ref Guide Jenkins jobs 
stopped working. The {{dev-tools/scripts/jenkins.build.ref.guide.sh}} script we 
used to build the Guide installed its own RVM and gemset for the required gems 
to run with the Ant build and it was difficult to get the paths right. Infra 
added the dependencies that we need to their Puppet-managed node deploy process 
(see INFRA-20656) and now we don't need a script to do any of that for us.

This issue is to track removing the script since it's no longer required. The 
Ref Guide build jobs will just invoke Ant directly instead.

IIUC from SOLR-10568 when the script was added, there might still come a day 
when there is a version mismatch between what was installed by default and what 
our build needs, but I think it's fair to try to work with Infra to get our 
needs met on the nodes instead of adding them to a script which makes migration 
like this more complex.

All of these pre-build dependencies go away, however, when we move to Gradle, 
so even if we have a version mismatch one time it won't be a persistent issue.






[GitHub] [lucene-solr] dweiss commented on a change in pull request #1732: Clean up many small fixes

2020-08-10 Thread GitBox


dweiss commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468106643



##
File path: 
lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsWriter.java
##
@@ -709,7 +709,7 @@ private PendingBlock writeBlock(int prefixLength, boolean 
isFloor, int floorLead
 
   PendingTerm term = (PendingTerm) ent;
 
-  assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" 
+ term.termBytes + " prefix=" + prefix;
+  assert StringHelper.startsWith(term.termBytes, prefix): "term.term=" 
+ new String(term.termBytes) + " prefix=" + prefix;

Review comment:
   This is wrong, uses default locale.
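A minimal sketch of the fix the review points toward. Strictly, the hazard with `new String(byte[])` is the platform default charset rather than the locale: decoding the same bytes can produce different strings on different machines. Assuming the term bytes are UTF-8, decoding with an explicit charset is deterministic (names here are illustrative):

```java
import java.nio.charset.StandardCharsets;

public class BytesToString {
  public static void main(String[] args) {
    byte[] termBytes = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 encoding of 'é'
    // Platform-dependent: decodes with the JVM's default charset.
    String risky = new String(termBytes);
    // Deterministic: the charset is explicit.
    String safe = new String(termBytes, StandardCharsets.UTF_8);
    System.out.println(safe.equals("\u00e9")); // same result on every platform
  }
}
```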

##
File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java
##
@@ -94,7 +94,7 @@
* Create a new Analyzer, reusing the same set of components per-thread
* across calls to {@link #tokenStream(String, Reader)}. 
*/
-  public Analyzer() {

Review comment:
   Can you not change those scopes in public API classes? This applies here 
and in other places -- protected changed to package-scope for source is not 
really an API-compatible change.

##
File path: 
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterFlushControl.java
##
@@ -324,12 +324,12 @@ synchronized void doOnAbort(DocumentsWriterPerThread 
perThread) {
 }
   }
 
-  private void checkoutAndBlock(DocumentsWriterPerThread perThread) {
+  private synchronized void checkoutAndBlock(DocumentsWriterPerThread 
perThread) {

Review comment:
   These are serious changes... you're adding synchronization to core 
classes. I don't think they should be piggybacked on top of trivial ones. I'm 
sure @s1monw would chip in on whether this synchronization makes sense, but 
he'll probably overlook it if it's buried in a bulk of trivial changes.

##
File path: lucene/core/src/java/org/apache/lucene/index/DocValuesUpdate.java
##
@@ -152,12 +152,12 @@ static BytesRef readFrom(DataInput in, BytesRef scratch) 
throws IOException {
 }
 
 NumericDocValuesUpdate(Term term, String field, Long value) {
-  this(term, field, value != null ? value.longValue() : -1, 
BufferedUpdates.MAX_INT, value != null);
+  this(term, field, value != null ? value : -1, BufferedUpdates.MAX_INT, 
value != null);
 }
 
 
-private NumericDocValuesUpdate(Term term, String field, long value, int 
docIDUpTo, boolean hasValue) {

Review comment:
   previous version was correct camel case (upTo).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob opened a new pull request #1732: Clean up many small fixes

2020-08-10 Thread GitBox


madrob opened a new pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732


   * Abstract classes don't need public constructors since they can only be
 called by subclasses
   * Don't escape html characters in @code tags in javadoc
   * Fixed a few int/long arithmetic issues
   * Use Arrays.toString instead of implicit byte[].toString
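The last bullet is easy to demonstrate in isolation: a byte[] inherits Object.toString(), which prints a type tag and identity hash rather than the array's contents.

```java
import java.util.Arrays;

public class ArrayPrinting {
  public static void main(String[] args) {
    byte[] data = {1, 2, 3};
    // Implicit toString: type name plus identity hash, e.g. "[B@1b6d3586"
    System.out.println(data.toString().startsWith("[B@")); // prints "true"
    // Arrays.toString: the actual contents
    System.out.println(Arrays.toString(data)); // prints "[1, 2, 3]"
  }
}
```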






[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them

2020-08-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175004#comment-17175004
 ] 

David Smiley commented on LUCENE-2822:
--

I could imagine an implementation that checks the clock every X advanced doc 
IDs (including docs that are not collected), tracking how many nanoseconds the 
last interval took so that X can be adjusted, checking more or less frequently 
as needed to meet the deadline.
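A hypothetical sketch of that adaptive idea (all names invented, not Lucene's API): consult the clock only every X docs, and shrink or grow X based on how large a share of the remaining budget the last interval consumed.

```java
public class AdaptiveTimeCheck {
  private final long deadlineNanos;
  private int interval = 1024;   // X: docs between clock checks
  private int sinceCheck = 0;
  private long lastCheckNanos;

  AdaptiveTimeCheck(long budgetNanos) {
    this.lastCheckNanos = System.nanoTime();
    this.deadlineNanos = lastCheckNanos + budgetNanos;
  }

  /** Called once per advanced doc ID; returns true when time is up. */
  boolean shouldExit() {
    if (++sinceCheck < interval) {
      return false;              // fast path: no clock read
    }
    long now = System.nanoTime();
    long elapsed = now - lastCheckNanos;
    long remaining = deadlineNanos - now;
    // If the last interval consumed a large share of the remaining budget,
    // check more often; otherwise back off.
    if (elapsed * 4 > remaining) {
      interval = Math.max(interval / 2, 16);
    } else {
      interval = Math.min(interval * 2, 65536);
    }
    lastCheckNanos = now;
    sinceCheck = 0;
    return remaining <= 0;
  }
}
```

A real version would live inside whatever drives doc-ID iteration; the sketch only shows the bookkeeping.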

BTW it's unfortunate that ExitableDirectoryReader and TimeLimitingCollector 
don't even refer to each other in their javadocs; they use different 
exceptions, live in different packages, and track time differently as well. 
Not user friendly. ExitableDirectoryReader is used earlier in the search process 
(covering query rewrite of wildcards, which is important), but _I think_ spans 
to nearly the end of collection, since the query should be reading the index 
relating to the final doc collected.  So I wonder if we need a 
TimeLimitingCollector at all?

> TimeLimitingCollector starts thread in static {} with no way to stop them
> -
>
> Key: LUCENE-2822
> URL: https://issues.apache.org/jira/browse/LUCENE-2822
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Simon Willnauer
>Priority: Major
> Fix For: 3.5, 4.0-ALPHA
>
> Attachments: LUCENE-2822.patch, LUCENE-2822.patch, LUCENE-2822.patch, 
> LUCENE-2822.patch
>
>
> See the comment in LuceneTestCase.
> If you even do Class.forName("TimeLimitingCollector") it starts up a thread 
> in a static method, and there isn't a way to kill it.
> This is broken.






[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-10 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174997#comment-17174997
 ] 

Dawid Weiss commented on LUCENE-8626:
-

bq. Still, without automated enforcement

For LuceneTestCase subclasses an automatic enforcement of this is trivial: add 
a test rule (or before class hook) that checks test class name (it can go up 
the chain of superclasses but doesn't have to). The benefit of doing this vs. 
file name checks is that actual test suites would be verified - not any other 
class that isn't a test suite.

It would also work across all projects. Including those that import 
lucene-test-framework (which may be problematic for people?).
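A minimal sketch of the check described above (hypothetical names; a real version would be a JUnit rule or before-class hook in LuceneTestCase, inspecting the suite class and walking its superclass chain):

```java
import java.util.List;

public class SuiteNameCheck {
  // Accepts the simple names of the suite class and its superclasses, in
  // order; passes if any of them follows the Test* / *Test convention.
  static boolean isValidSuiteName(List<String> classChain) {
    for (String name : classChain) {
      if (name.startsWith("Test") || name.endsWith("Test")) {
        return true;
      }
    }
    return false;
  }
}
```

The benefit over a file-name scan, as noted, is that only classes that actually run as suites are checked.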

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-10 Thread David Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174995#comment-17174995
 ] 

David Eric Pugh commented on SOLR-14726:


I notice that on the page where we discuss curl, 
https://lucene.apache.org/solr/guide/8_6/introduction-to-solr-indexing.html#introduction-to-solr-indexing,
 there is a rather random comment about using wget with Perl. Thoughts on 
removing this line:

Instead of curl, you can use utilities such as GNU wget 
(http://www.gnu.org/software/wget/) or manage GETs and POSTS with Perl, 
although the command line options will differ.

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our 
> best practices, which should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long. On my laptop, it took 35 Page Down 
> presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.






[GitHub] [lucene-solr] murblanc commented on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-10 Thread GitBox


murblanc commented on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671509879


   
   > The current concrete classes do not use/implement these interfaces. These 
interfaces will only be a part of implementations. for instance, the 
`LazySolrCluster` is one of the impl. In the future we should add a couple more
   
   @noble what is the target use case of the interface and lazy implementation? 
I thought your aim was to create interfaces to the existing internal classes, 
so I expected those classes to implement the interfaces and the interfaces to 
be used in the code in place of the concrete classes...
   Maybe it's just me not understanding your intention here. 






[jira] [Resolved] (SOLR-14702) Remove Master and Slave from Code Base and Docs

2020-08-10 Thread Tomas Eduardo Fernandez Lobbe (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomas Eduardo Fernandez Lobbe resolved SOLR-14702.
--
Fix Version/s: 8.7
   master (9.0)
   Resolution: Fixed

Thanks [~marcussorealheis] for driving this and everyone who contributed. This 
has been merged and backported. Let's take any follow-up tasks as new Jira issues.

> Remove Master and Slave from Code Base and Docs
> ---
>
> Key: SOLR-14702
> URL: https://issues.apache.org/jira/browse/SOLR-14702
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: master (9.0)
>Reporter: Marcus Eagan
>Priority: Critical
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-14742-testfix.patch
>
>  Time Spent: 17h
>  Remaining Estimate: 0h
>
> Every time I read _master_ and _slave_, I get pissed.
> I think about the last and only time I remember visiting my maternal great 
> grandpa in Alabama at four years old. He was a sharecropper before WWI, where 
> he lost his legs, and then he was back to being a sharecropper somehow after 
> the war. Crazy, I know. I don't know if the world still called his job 
> sharecropping in 1993, but he was basically a slave—in America. He lived in 
> the same shack that his father, and his grandfather (born a slave) lived in 
> down in Alabama. Believe it or not, my dad's (born in 1926) grandfather was 
> actually born a slave, freed shortly after birth by his owner father. I never 
> met him, though. He died in the 40s.
> Anyway, I cannot police all terms in the repo and do not wish to. This 
> master/slave shit is archaic and misleading on technical grounds. Thankfully, 
> there's only a handful of files in code and documentation that still talk 
> about masters and slaves. We should replace all of them.
> There are so many ways to reword it. In fact, unless anyone else objects or 
> wants to do the grunt work to help my stress levels, I will open the pull 
> request myself in effort to make this project and community more inviting to 
> people of all backgrounds and histories. We can have leader/follower, or 
> primary/secondary, but none of this Master/Slave nonsense. I'm sick of the 
> garbage. 
>  






[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174951#comment-17174951
 ] 

Ishan Chattopadhyaya commented on SOLR-14726:
-

The main idea behind using curl is not just to let the user post the 
documents. The main benefit I see is that developers in almost any programming 
language can easily understand what is happening with the curl commands and 
likely already know how to achieve the same in their language of choice. The 
universal familiarity of curl helps here.

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our 
> best practices, which should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long. On my laptop, it took 35 Page Down 
> presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.






[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174948#comment-17174948
 ] 

Ishan Chattopadhyaya commented on SOLR-14726:
-

bq. Everyone knows cURL.
Agree with Marcus here. I think curl + jq should be sufficient. Is there 
anything else that the post tool can do which curl can't?

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our 
> best practices, which should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long. On my laptop, it took 35 Page Down 
> presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.






[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-10 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r468065465



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlugin.java
##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Implemented by external plugins to control replica placement and movement 
on the search cluster (as well as other things
+ * such as cluster elasticity?) when cluster changes are required (initiated 
elsewhere, most likely following a Collection
+ * API call).
+ */
+public interface PlacementPlugin {

Review comment:
   I believe we should let the plug-in manage this type of requirement 
rather than trying to control it via the timing of when configs are passed. If 
there are licences to check, the plug-in should cache the result and only 
re-confirm when needed? 








[jira] [Comment Edited] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them

2020-08-10 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174930#comment-17174930
 ] 

Uwe Schindler edited comment on LUCENE-2822 at 8/10/20, 5:18 PM:
-

In addition, the extra thread will soon no longer be an issue (the new 
lightweight thread implementation coming in later Java versions), also known 
as fibers.


was (Author: thetaphi):
In addition, the extra thread will soon be no issue anymore (the new thread 
impl coming with later java versions).

> TimeLimitingCollector starts thread in static {} with no way to stop them
> -
>
> Key: LUCENE-2822
> URL: https://issues.apache.org/jira/browse/LUCENE-2822
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Simon Willnauer
>Priority: Major
> Fix For: 3.5, 4.0-ALPHA
>
> Attachments: LUCENE-2822.patch, LUCENE-2822.patch, LUCENE-2822.patch, 
> LUCENE-2822.patch
>
>
> See the comment in LuceneTestCase.
> If you even do Class.forName("TimeLimitingCollector") it starts up a thread 
> in a static method, and there isn't a way to kill it.
> This is broken.






[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them

2020-08-10 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174930#comment-17174930
 ] 

Uwe Schindler commented on LUCENE-2822:
---

In addition, the extra thread will soon no longer be an issue (the new 
lightweight thread implementation coming in later Java versions).

> TimeLimitingCollector starts thread in static {} with no way to stop them
> -
>
> Key: LUCENE-2822
> URL: https://issues.apache.org/jira/browse/LUCENE-2822
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Simon Willnauer
>Priority: Major
> Fix For: 3.5, 4.0-ALPHA
>
> Attachments: LUCENE-2822.patch, LUCENE-2822.patch, LUCENE-2822.patch, 
> LUCENE-2822.patch
>
>
> See the comment in LuceneTestCase.
> If you even do Class.forName("TimeLimitingCollector") it starts up a thread 
> in a static method, and there isn't a way to kill it.
> This is broken.






[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them

2020-08-10 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174917#comment-17174917
 ] 

Uwe Schindler commented on LUCENE-2822:
---

Hi,
it depends on the operating system. nanoTime is still officially a syscall on 
all operating systems, but some libc implementations make it a volatile read 
from address space mapped into the process (e.g. macOS), only falling back to a 
syscall if the result can't be trusted. So in general, calling it on every hit 
is still a bad idea if you don't strictly need it. I'd use some modulo operation 
and maybe call it only every 1000 hits.
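A minimal sketch of that suggestion (class and method names are invented for illustration, not a proposed Lucene API): count hits and only pay for System.nanoTime() on every 1024th one, using a power-of-two mask instead of a true modulo:

```java
import java.util.concurrent.TimeUnit;

// Sketch of the sampled-clock idea above; names are invented for illustration.
public class SampledTimeout {
    // Power of two so the modulo reduces to a cheap bit mask.
    private static final int SAMPLE_INTERVAL = 1024;

    private final long deadlineNanos;
    private int hits;

    public SampledTimeout(long timeAllowedMillis) {
        this.deadlineNanos = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(timeAllowedMillis);
    }

    /** Call once per collected hit; returns true once the budget is exceeded. */
    public boolean collectAndCheck() {
        hits++;
        if ((hits & (SAMPLE_INTERVAL - 1)) != 0) {
            return false; // skip the (possible) syscall on 1023 of 1024 hits
        }
        // Overflow-safe comparison, as recommended for nanoTime arithmetic.
        return System.nanoTime() - deadlineNanos > 0;
    }
}
```

The trade-off is granularity: a timeout can overshoot by up to 1023 hits' worth of work, which is usually acceptable for query time limits.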







[jira] [Commented] (LUCENE-2822) TimeLimitingCollector starts thread in static {} with no way to stop them

2020-08-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174461#comment-17174461
 ] 

David Smiley commented on LUCENE-2822:
--

[~uschindler] (or anyone), is System.nanoTime still considered expensive in 
modern JVMs to call once per collected doc?  You commented about its expense 
above.

Alternatively, nanoTime could be called only when the doc collection delta 
exceeds, say, 100 docs since the last nanoTime check.

I'm digging this old issue up because, where I work, we made some improvements 
to this utility many years ago relating to thread starvation under load. As I 
look at it, I just don't like this Thread here at all, so I'm wondering if we 
can just remove it instead of enhancing its existing mechanism.







[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-10 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174436#comment-17174436
 ] 

Marcus Eagan commented on SOLR-14726:
-

bq. My experience from delivering the Solr version of Think Like a Relevance 
Engineer is that MANY MANY people aren't able to install python. They may be on 
Windows, they may not have "Developer" permissions, they may have Python 2 
versus 3, or it's just not something they use at all.

Yeah, even outside of Python specialists, I have watched very skilled engineers 
struggle with Python and with understanding pip and virtual environments. I 
would vote absolutely against adding Python, even though it is my favorite and 
strongest language by far. 

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our 
> best practices, which should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 Page 
> Down key presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.
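As a sketch of suggestion #4 above, the `bin/solr post` steps can be expressed with plain curl against Solr's JSON update endpoint (assuming a local Solr at port 8983 and an existing `films` collection; the collection and file names are illustrative):

```shell
# Index a JSON file of documents (equivalent of `bin/solr post`):
curl -X POST 'http://localhost:8983/solr/films/update?commit=true' \
     -H 'Content-Type: application/json' \
     --data-binary @example/films/films.json

# Or index a couple of inline documents:
curl -X POST 'http://localhost:8983/solr/films/update?commit=true' \
     -H 'Content-Type: application/json' \
     -d '[{"id":"1","name":"The Matrix"}]'
```

The `commit=true` parameter makes the documents immediately searchable; in production one would usually rely on autoCommit instead.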






[GitHub] [lucene-solr] madrob commented on a change in pull request #1731: LUCENE-9451 Sort.rewrite does not always return this when unchanged

2020-08-10 Thread GitBox


madrob commented on a change in pull request #1731:
URL: https://github.com/apache/lucene-solr/pull/1731#discussion_r468031592



##
File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
##
@@ -456,13 +456,16 @@ public String toString() {
 
 @Override
 public SortField rewrite(IndexSearcher searcher) throws IOException {
-  DoubleValuesSortField rewritten = new 
DoubleValuesSortField(producer.rewrite(searcher), reverse);
+  DoubleValuesSource rewrittenSource = producer.rewrite(searcher);
+  if (rewrittenSource == producer) {

Review comment:
   This might be better as an object equality check instead of a reference 
equality check; I'm not sure.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob opened a new pull request #1731: LUCENE-9451 Sort.rewrite does not always return this when unchanged

2020-08-10 Thread GitBox


madrob opened a new pull request #1731:
URL: https://github.com/apache/lucene-solr/pull/1731


   https://issues.apache.org/jira/browse/LUCENE-9451 






[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-10 Thread David Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174433#comment-17174433
 ] 

David Eric Pugh commented on SOLR-14726:


My experience from delivering the Solr version of _Think Like a Relevance 
Engineer_ is that MANY MANY people aren't able to install python.  They may be 
on Windows, they may not have "Developer" permissions, they may have Python 2 
versus 3, or it's just not something they use at all.  

In fact, I'm working on stripping out the Python requirement 
(https://github.com/o19s/solr-tmdb#index-tmdb-movies) for the sample data set, 
and I'm hoping to ideally use the Solr Admin -> Documents -> File Upload 
feature (though I see it may be tied to the Extracting Request Handler) to load 
a Solr-formatted .json file.







[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-10 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174434#comment-17174434
 ] 

Marcus Eagan commented on SOLR-14726:
-

bq. If we could rely on Python and felt free to ask people to install things, I 
would lean towards HTTPie instead of curl: https://httpie.org/

HTTPie is amazing but still fringe in terms of adoption. Everyone knows cURL.

I think that we should be able to point people to public datasets that are 
hosted elsewhere (maybe by one of us) rather than shipping Solr with example 
data. I'm happy to donate a public data repository for 10 years. 









[jira] [Created] (LUCENE-9451) Sort.rewrite doesn't always return this when unchanged

2020-08-10 Thread Mike Drob (Jira)
Mike Drob created LUCENE-9451:
-

 Summary: Sort.rewrite doesn't always return this when unchanged
 Key: LUCENE-9451
 URL: https://issues.apache.org/jira/browse/LUCENE-9451
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 8.7
Reporter: Mike Drob
Assignee: Mike Drob


Sort.rewrite doesn't always return {{this}} as advertised in the Javadoc even 
if the underlying fields are unchanged. This is because the comparison uses 
reference equality.

There are two possible fixes here: 1) switch from reference equality to object 
equality, and 2) fix some of the underlying sort fields so they stop creating 
unnecessary objects.

cc: [~jpountz] [~romseygeek]
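A toy sketch of option 1 (all names invented, not the actual Lucene classes): compare the rewritten source with equals() so that an equal-but-new instance still lets rewrite() honor the "return this when unchanged" contract:

```java
import java.util.Objects;

// Illustrative model of the rewrite contract; not the Lucene implementation.
public class RewriteSketch {

    interface Source {
        Source rewrite();
    }

    /** A source whose rewrite() returns a NEW but equal instance. */
    static final class ConstSource implements Source {
        final int value;
        ConstSource(int value) { this.value = value; }
        public Source rewrite() { return new ConstSource(value); }
        @Override public boolean equals(Object o) {
            return o instanceof ConstSource && ((ConstSource) o).value == value;
        }
        @Override public int hashCode() { return value; }
    }

    static final class SortFieldLike {
        final Source source;
        SortFieldLike(Source source) { this.source = source; }

        SortFieldLike rewrite() {
            Source rewritten = source.rewrite();
            // equals(), not ==: an equal rewrite result means "unchanged",
            // so the documented "return this" contract can be honored.
            if (Objects.equals(rewritten, source)) {
                return this;
            }
            return new SortFieldLike(rewritten);
        }
    }

    public static void main(String[] args) {
        SortFieldLike f = new SortFieldLike(new ConstSource(42));
        System.out.println("returns this: " + (f.rewrite() == f));
    }
}
```

With reference equality (`rewritten == source`) the same example would allocate a fresh SortFieldLike even though nothing changed, which is exactly the reported bug.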






[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-10 Thread Alexandre Rafalovitch (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174412#comment-17174412
 ] 

Alexandre Rafalovitch commented on SOLR-14726:
--

If we could rely on Python and felt free to ask people to install things, I 
would lean towards HTTPie instead of curl: [https://httpie.org/]
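For comparison, the same illustrative Solr query in both tools (assuming a local Solr at port 8983 with a `techproducts` collection):

```shell
# curl: you build and URL-encode the query string yourself
curl 'http://localhost:8983/solr/techproducts/select?q=*:*&rows=5'

# HTTPie: param==value pairs are URL-encoded for you, and JSON responses
# are pretty-printed and colorized by default
http 'localhost:8983/solr/techproducts/select' q=='*:*' rows==5
```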

 







[GitHub] [lucene-solr] HoustonPutman merged pull request #1716: SOLR-14706: Fix support for default autoscaling policy

2020-08-10 Thread GitBox


HoustonPutman merged pull request #1716:
URL: https://github.com/apache/lucene-solr/pull/1716


   






[jira] [Commented] (SOLR-14706) Upgrading 8.6.0 to 8.6.1 causes collection creation to fail

2020-08-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174398#comment-17174398
 ] 

ASF subversion and git services commented on SOLR-14706:


Commit 6e11a1c3f0599f1c918bc69c4f51928d23160e99 in lucene-solr's branch 
refs/heads/branch_8_6 from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6e11a1c ]

SOLR-14706: Fix support for default autoscaling policy (#1716)



> Upgrading 8.6.0 to 8.6.1 causes collection creation to fail
> ---
>
> Key: SOLR-14706
> URL: https://issues.apache.org/jira/browse/SOLR-14706
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 8.7, 8.6.1
> Environment: 8.6.1 upgraded from 8.6.0 with more than one node
>Reporter: Gus Heck
>Assignee: Houston Putman
>Priority: Blocker
> Fix For: 8.6.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The following steps will reproduce a situation in which collection creation 
> fails with this stack trace:
> {code:java}
> 2020-08-03 12:17:58.617 INFO  
> (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [   
> ] o.a.s.c.a.c.CreateCollectionCmd Create collection test861
> 2020-08-03 12:17:58.751 ERROR 
> (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [   
> ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test861 operation: 
> create failed:org.apache.solr.common.SolrException
>   at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:347)
>   at 
> org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264)
>   at 
> org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:517)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Only one extra tag supported for the 
> tag cores in {
>   "cores":"#EQUAL",
>   "node":"#ANY",
>   "strict":"false"}
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.Clause.(Clause.java:122)
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.Clause.create(Clause.java:235)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>   at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy.(Policy.java:144)
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.AutoScalingConfig.getPolicy(AutoScalingConfig.java:372)
>   at 
> org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:300)
>   at 
> org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:277)
>   at 
> org.apache.solr.cloud.api.collections.Assign$AssignStrategyFactory.create(Assign.java:661)
>   at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:415)
>   at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:192)
>   ... 6 more
> {code}
> Generalized steps:
> # Deploy 8.6.0 with separate data directories, create a collection to prove 
> it's working
> # download 
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.6.1-RC1-reva32a3ac4e43f629df71e5ae30a3330be94b095f2/solr/solr-8.6.1.tgz
> # Stop the server on all nodes
> # replace the 8.6.0 with 8.6.1 
> # Start the server
> # via the admin UI create a collection
> # Observe failure warning box (with no text), check logs, find above trace
> Or more exactly here are my actual commands with a checkout of the 8.6.0 tag 
> in the working dir to which cloud.sh was configured:
> # /cloud.sh new -r upgrademe 
> # Create collection named test860 via admin ui with _default
> # ./cloud.sh stop 
> # cd upgrademe/
> # cp ../8_6_1_RC1/solr-8.6.1.tgz .
> # mv solr-8.6.0-SNAPSHOT old
> # tar xzvf 

[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174394#comment-17174394
 ] 

Tomoko Uchida commented on LUCENE-2458:
---

Seems spam account? 

> queryparser makes all CJK queries phrase queries regardless of analyzer
> ---
>
> Key: LUCENE-2458
> URL: https://issues.apache.org/jira/browse/LUCENE-2458
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/queryparser
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Blocker
> Fix For: 3.1, 4.0-ALPHA
>
> Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch, 
> LUCENE-2458.patch
>
>
> The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan, 
> ... queries into phrase queries, even though you didn't ask for one, and 
> there isn't a way to turn this off.
> This completely breaks lucene for these languages, as it treats all queries 
> like 'grep'.
> Example: if you query for f:abcd with standardanalyzer, where a,b,c,d are 
> chinese characters, you get a phrasequery of "a b c d". if you use cjk 
> analyzer, its no better, its a phrasequery of  "ab bc cd", and if you use 
> smartchinese analyzer, you get a phrasequery like "ab cd". But the user 
> didn't ask for one, and they cannot turn it off.
> The reason is that the code to form phrase queries is not internationally 
> appropriate and assumes whitespace tokenization. If more than one token comes 
> out of whitespace delimited text, its automatically a phrase query no matter 
> what.
> The proposed patch fixes the core queryparser (with all backwards compat 
> kept) to only form phrase queries when the double quote operator is used. 
> Implementing subclasses can always extend the QP and auto-generate whatever 
> kind of queries they want that might completely break search for languages 
> they don't care about, but core general-purpose QPs should be language 
> independent.






[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Mr. Aleem (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174386#comment-17174386
 ] 

Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:26 PM:
-




[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Mr. Aleem (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174386#comment-17174386
 ] 

Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:19 PM:
-

I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.


was (Author: maleem):
I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.




[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Mr. Aleem (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174386#comment-17174386
 ] 

Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:17 PM:
-

I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.


was (Author: maleem):
I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.




[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Mr. Aleem (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174386#comment-17174386
 ] 

Mr. Aleem commented on LUCENE-2458:
---

I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.




[jira] [Issue Comment Deleted] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Mr. Aleem (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mr. Aleem updated LUCENE-2458:
--
Comment: was deleted

(was: I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.)




[jira] [Comment Edited] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Mr. Aleem (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174385#comment-17174385
 ] 

Mr. Aleem edited comment on LUCENE-2458 at 8/10/20, 3:15 PM:
-

I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.


was (Author: maleem):
I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.




[jira] [Commented] (LUCENE-2458) queryparser makes all CJK queries phrase queries regardless of analyzer

2020-08-10 Thread Mr. Aleem (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174385#comment-17174385
 ] 

Mr. Aleem commented on LUCENE-2458:
---

I think the best way forward is to add a CJK field to Solr, which by default has the opposite behavior (as Koji noted, it seems), i.e. treats the split tokens as completely separate.




[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-10 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467947455



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlanFactory.java
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Allows plugins to create {@link PlacementPlan}s telling the Solr layer 
where to create replicas following the processing of
+ * a {@link PlacementRequest}. The Solr layer can (and will) check that the 
{@link PlacementPlan} conforms to the {@link PlacementRequest} (and
+ * if it does not, the requested operation will fail).
+ */
+public interface PlacementPlanFactory {
+  /**
+   * Creates a {@link PlacementPlan} for adding a new collection and its 
replicas.
+   *
+   * This is in support of {@link 
org.apache.solr.cloud.api.collections.CreateCollectionCmd}.
+   */
+  PlacementPlan createPlacementPlanNewCollection(CreateNewCollectionPlacementRequest request, String CollectionName, Set<ReplicaPlacement> replicaPlacements);
+
+  /**
+   * Creates a {@link PlacementPlan} for adding replicas to a given shard 
of an existing collection.
+   *
+   * This is in support (directly or indirectly) of {@link 
org.apache.solr.cloud.api.collections.AddReplicaCmd},
+   * {@link org.apache.solr.cloud.api.collections.CreateShardCmd}, {@link 
org.apache.solr.cloud.api.collections.ReplaceNodeCmd},
+   * {@link org.apache.solr.cloud.api.collections.MoveReplicaCmd}, {@link 
org.apache.solr.cloud.api.collections.SplitShardCmd},
+   * {@link org.apache.solr.cloud.api.collections.RestoreCmd} and {@link 
org.apache.solr.cloud.api.collections.MigrateCmd}.
+   * (as well as of {@link 
org.apache.solr.cloud.api.collections.CreateCollectionCmd} in the specific case 
of
+   * {@link 
org.apache.solr.common.params.CollectionAdminParams#WITH_COLLECTION} but this 
should be removed shortly and
+   * the section in parentheses of this comment should be removed when the 
{@code withCollection} javadoc link appears broken).
+   */
+  PlacementPlan createPlacementPlanAddReplicas(AddReplicasPlacementRequest request, String CollectionName, Set<ReplicaPlacement> replicaPlacements);
+
+  /**
+   * Creates a {@link ReplicaPlacement} needed to be passed to some/all {@link 
PlacementPlan} factory methods.
+   */
+  ReplicaPlacement createReplicaPlacement(String shardName, Node node, 
Replica.ReplicaType replicaType);

Review comment:
   That’s how the plugin builds the replica placements it has decided on, in order to pass the set to the appropriate PlacementPlanFactory method.
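
   A hedged sketch of the kind of decision a placement plugin would make before handing results to a PlacementPlanFactory: pick a target node for each new replica. This is plain JDK code with made-up names (round-robin over live nodes), not the actual Solr interfaces:

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Toy placement decision: assign each requested replica of a shard to a
   // live node, round-robin, and return "shard@node" placement strings.
   // Illustrative only -- not the Solr placement plugin API.
   class RoundRobinPlacementSketch {
       static List<String> place(String shard, int numReplicas, List<String> liveNodes) {
           List<String> placements = new ArrayList<>();
           for (int i = 0; i < numReplicas; i++) {
               // Wrap around the live-node list so load spreads evenly.
               placements.add(shard + "@" + liveNodes.get(i % liveNodes.size()));
           }
           return placements;
       }

       public static void main(String[] args) {
           // Three replicas over two nodes: node1 gets two, node2 gets one.
           System.out.println(place("shard1", 3, List.of("node1", "node2")));
       }
   }
   ```

   A real plugin would feed each such decision through createReplicaPlacement and pass the resulting set to the factory method.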





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-10 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467945960



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlanFactory.java
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Allows plugins to create {@link PlacementPlan}s telling the Solr layer 
where to create replicas following the processing of
+ * a {@link PlacementRequest}. The Solr layer can (and will) check that the 
{@link PlacementPlan} conforms to the {@link PlacementRequest} (and
+ * if it does not, the requested operation will fail).
+ */
+public interface PlacementPlanFactory {
+  /**
+   * Creates a {@link PlacementPlan} for adding a new collection and its 
replicas.
+   *
+   * This is in support of {@link 
org.apache.solr.cloud.api.collections.CreateCollectionCmd}.
+   */
+  PlacementPlan createPlacementPlanNewCollection(CreateNewCollectionPlacementRequest request, String CollectionName, Set<ReplicaPlacement> replicaPlacements);
+
+  /**
+   * Creates a {@link PlacementPlan} for adding replicas to a given shard 
of an existing collection.
+   *
+   * This is in support (directly or indirectly) of {@link 
org.apache.solr.cloud.api.collections.AddReplicaCmd},
+   * {@link org.apache.solr.cloud.api.collections.CreateShardCmd}, {@link 
org.apache.solr.cloud.api.collections.ReplaceNodeCmd},
+   * {@link org.apache.solr.cloud.api.collections.MoveReplicaCmd}, {@link 
org.apache.solr.cloud.api.collections.SplitShardCmd},
+   * {@link org.apache.solr.cloud.api.collections.RestoreCmd} and {@link 
org.apache.solr.cloud.api.collections.MigrateCmd}.
+   * (as well as of {@link 
org.apache.solr.cloud.api.collections.CreateCollectionCmd} in the specific case 
of
+   * {@link 
org.apache.solr.common.params.CollectionAdminParams#WITH_COLLECTION} but this 
should be removed shortly and
+   * the section in parentheses of this comment should be removed when the 
{@code withCollection} javadoc link appears broken).
+   */
+  PlacementPlan createPlacementPlanAddReplicas(AddReplicasPlacementRequest request, String CollectionName, Set<ReplicaPlacement> replicaPlacements);

Review comment:
   Is the move replica command picking up the destination or is the 
destination specified in the API call? If the latter, there will be no call to 
the placement plugin.
   And if the former, the fact that no files are to be moved is relatively 
transparent to the plugin. The plugin doesn’t do any work but just tells Solr 
where to put things. Solr code would then either create or move depending on 
what command it was executing.
The only difference could be that the move placement computation (if there is one) 
should take into account the lower load on the source node (since replicas will 
be moved off of it).








[GitHub] [lucene-solr] murblanc commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-10 Thread GitBox


murblanc commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467943083



##
File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java
##
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * A representation of the (initial) cluster state, providing information 
on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * This instance can also be used as a {@link PropertyValueSource} if 
{@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never null, never empty 
(Solr wouldn't call the plugin if empty
+   * since no useful work could then be done).
+   */
+  Set<Node> getLiveNodes();
+
+  /**
+   * Returns info about the given collection if one exists. Because it is 
not expected for plugins to request info about
+   * a large number of collections, requests can only be made one by one.
+   *
+   * This is also the reason we do not return a {@link java.util.Map} or 
{@link Set} of {@link SolrCollection}'s here: it would be
+   * wasteful to fetch all data and fill such a map when plugin code likely 
needs info about at most one or two collections.
+   */
+  Optional<SolrCollection> getCollection(String collectionName) throws IOException;
+
+  /**
+   * Allows getting all {@link SolrCollection} present in the cluster.
+   *
+   * WARNING: this call might be extremely inefficient on large 
clusters. Usage is discouraged.
+   */
+  Set<SolrCollection> getAllCollections();

Review comment:
   I think there was no call to get only names (away from computer for a 
week or more). The only call on cluster state returned DocCollection set or 
map...








[jira] [Commented] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields

2020-08-10 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174343#comment-17174343
 ] 

Michael McCandless commented on LUCENE-9450:


+1, thanks [~gworah]!  It is really silly that the taxonomy index uses stored 
fields today and must do a number of stored-field lookups for each query to 
resolve taxonomy ordinals back to human-presentable facet labels.

At search time, after pulling the {{BinaryDocValues}}, you need to 
{{.advanceExact}} to that docid, confirm (maybe, {{assert}}?) that method 
returns {{true}}, then pull the {{.binaryValue()}}.

Did you see an exception in tests when you tried your patch?  The default 
{{Codec}} should throw an exception if you try to pull a {{.binaryValue()}} 
without first calling {{.advanceExact()}}, I hope.
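
The advanceExact/binaryValue contract described above can be sketched with a toy model (plain JDK code, not the real Lucene BinaryDocValues class; the state check only mimics what the codec-level checking wrapper enforces):

```java
import java.util.Map;

// Toy model of the doc-values access pattern: advanceExact(doc) must be
// called (and return true) before binaryValue() may be pulled.
// Illustrative only -- not the actual Lucene API.
class ToyBinaryDocValues {
    private final Map<Integer, String> values; // docId -> stored label
    private String current;                    // positioned by advanceExact

    ToyBinaryDocValues(Map<Integer, String> values) { this.values = values; }

    boolean advanceExact(int docId) {
        current = values.get(docId);
        return current != null; // true iff this doc has a value
    }

    String binaryValue() {
        if (current == null) {
            // Mirrors the codec-level check: value pulled without positioning.
            throw new IllegalStateException("advanceExact() not called or returned false");
        }
        return current;
    }

    public static void main(String[] args) {
        ToyBinaryDocValues dv = new ToyBinaryDocValues(Map.of(7, "Authors/Tolkien"));
        if (dv.advanceExact(7)) {
            System.out.println(dv.binaryValue()); // Authors/Tolkien
        }
    }
}
```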

Also, at indexing time, it looks like you are no longer indexing the 
{{StringField}}, but I think you must keep indexing it and just change 
{{Field.Store.YES}} to {{Field.Store.NO}}.  This field is also indexed in the 
inverted index and is what allows us to do the label -> ordinal lookup, I think.

Maybe post some of the failing tests if those two above fixes still don't work? 
 Thanks for tackling this!

> Taxonomy index should use DocValues not StoredFields
> 
>
> Key: LUCENE-9450
> URL: https://issues.apache.org/jira/browse/LUCENE-9450
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/facet
>Affects Versions: 8.5.2
>Reporter: Gautam Worah
>Priority: Minor
>  Labels: performance
> Attachments: wip_taxonomy_patch
>
>
> The taxonomy index that maps binning labels to ordinals was created before 
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]






[jira] [Commented] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet

2020-08-10 Thread Cassandra Targett (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174336#comment-17174336
 ] 

Cassandra Targett commented on SOLR-13381:
--

The exact strategies your addition mentions in passing are discussed in the 
section on reindexing strategies, 
https://lucene.apache.org/solr/guide/8_6/reindexing.html#reindexing-strategies, 
in greater detail. My feeling is this note is repetition that doesn't add value.

I also don't like that it effectively becomes the 2nd sentence on the page, and 
it's overly general. Reindexing doesn't always require removing all documents 
first. Sometimes you _can_ just update the existing documents, but the page 
says a couple times that dropping the index is the preferred approach. The only 
thing I could think of adding, really, is to specifically add a sentence to the 
section on reindexing strategies that repeats the point that if you change 
field types, you must reindex *from scratch*.

For the folks who read the page and didn't understand this from the section on 
reindexing: could you perhaps share your thoughts on how we could have been 
clearer? I'm just skeptical that this one sentence at the top of the page is 
going to bring the point home if all the other discussion about it didn't.

> Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a 
> PointField facet
> --
>
> Key: SOLR-13381
> URL: https://issues.apache.org/jira/browse/SOLR-13381
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 7.0, 7.6, 7.7, 7.7.1
> Environment: solr, solrcloud
>Reporter: Zhu JiaJun
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13381.patch, SOLR-13381.patch
>
>
> Hey,
> I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform 
> a group facet on an IntPointField. Debugging into the source code, the cause 
> is that internally the docvalue type for PointField is "NUMERIC" (single 
> value) or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector 
> class requires that the facet field have a "SORTED" or "SORTED_SET" docvalue 
> type: 
> [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313]
>  
> When I change the schema for all int fields to TrieIntField, the group facet 
> then works, since internally the docvalue type for TrieField is SORTED 
> (single value) or SORTED_SET (multi value).
> Given that "TrieField" is deprecated in Solr 7, please help with this 
> grouping facet issue for PointField. I also commented on this issue in 
> SOLR-7495.
>  
> In addition, all places using "${solr.tests.IntegerFieldType}" in the unit 
> test files seem to use "TrieIntField"; if changed to "IntPointField", some 
> unit tests will fail, for example: 
> [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417]






[jira] [Updated] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet

2020-08-10 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-13381:
--
Attachment: SOLR-13381.patch
Status: Reopened  (was: Reopened)

[~ctargett] (or anyone) WDYT about the wordsmithing here? I verified that 
deleting all docs and committing actually does remove all the segments. I 
didn't want to get into a long explanation about segments, so I just left the 
rather cryptic comment about them.

> Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a 
> PointField facet
> --
>
> Key: SOLR-13381
> URL: https://issues.apache.org/jira/browse/SOLR-13381
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 7.7.1, 7.7, 7.6, 7.0
> Environment: solr, solrcloud
>Reporter: Zhu JiaJun
>Priority: Major
> Attachments: SOLR-13381.patch, SOLR-13381.patch
>
>
> Hey,
> I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform 
> a group facet on an IntPointField. Debugging into the source code, the cause 
> is that internally the docvalue type for PointField is "NUMERIC" (single 
> value) or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector 
> class requires that the facet field have a "SORTED" or "SORTED_SET" docvalue 
> type: 
> [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313]
>  
> When I change the schema for all int fields to TrieIntField, the group facet 
> then works, since internally the docvalue type for TrieField is SORTED 
> (single value) or SORTED_SET (multi value).
> Given that "TrieField" is deprecated in Solr 7, please help with this 
> grouping facet issue for PointField. I also commented on this issue in 
> SOLR-7495.
>  
> In addition, all places using "${solr.tests.IntegerFieldType}" in the unit 
> test files seem to use "TrieIntField"; if changed to "IntPointField", some 
> unit tests will fail, for example: 
> [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417]






[jira] [Assigned] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet

2020-08-10 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-13381:
-

Assignee: Erick Erickson

> Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a 
> PointField facet
> --
>
> Key: SOLR-13381
> URL: https://issues.apache.org/jira/browse/SOLR-13381
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 7.0, 7.6, 7.7, 7.7.1
> Environment: solr, solrcloud
>Reporter: Zhu JiaJun
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13381.patch, SOLR-13381.patch
>
>
> Hey,
> I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform 
> a group facet on an IntPointField. Debugging into the source code, the cause 
> is that internally the docvalue type for PointField is "NUMERIC" (single 
> value) or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector 
> class requires that the facet field have a "SORTED" or "SORTED_SET" docvalue 
> type: 
> [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313]
>  
> When I change the schema for all int fields to TrieIntField, the group facet 
> then works, since internally the docvalue type for TrieField is SORTED 
> (single value) or SORTED_SET (multi value).
> Given that "TrieField" is deprecated in Solr 7, please help with this 
> grouping facet issue for PointField. I also commented on this issue in 
> SOLR-7495.
>  
> In addition, all places using "${solr.tests.IntegerFieldType}" in the unit 
> test files seem to use "TrieIntField"; if changed to "IntPointField", some 
> unit tests will fail, for example: 
> [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417]






[jira] [Reopened] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet

2020-08-10 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reopened SOLR-13381:
---

Good point, I'll add a note to the docs.

> Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a 
> PointField facet
> --
>
> Key: SOLR-13381
> URL: https://issues.apache.org/jira/browse/SOLR-13381
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 7.0, 7.6, 7.7, 7.7.1
> Environment: solr, solrcloud
>Reporter: Zhu JiaJun
>Priority: Major
> Attachments: SOLR-13381.patch
>
>
> Hey,
> I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform 
> a group facet on an IntPointField. Debugging into the source code, the cause 
> is that internally the docvalue type for PointField is "NUMERIC" (single 
> value) or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector 
> class requires that the facet field have a "SORTED" or "SORTED_SET" docvalue 
> type: 
> [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313]
>  
> When I change the schema for all int fields to TrieIntField, the group facet 
> then works, since internally the docvalue type for TrieField is SORTED 
> (single value) or SORTED_SET (multi value).
> Given that "TrieField" is deprecated in Solr 7, please help with this 
> grouping facet issue for PointField. I also commented on this issue in 
> SOLR-7495.
>  
> In addition, all places using "${solr.tests.IntegerFieldType}" in the unit 
> test files seem to use "TrieIntField"; if changed to "IntPointField", some 
> unit tests will fail, for example: 
> [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417]






[jira] [Updated] (SOLR-14691) Metrics reporting should avoid creating objects

2020-08-10 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14691:
--
Priority: Blocker  (was: Major)

> Metrics reporting should avoid creating objects
> ---
>
> Key: SOLR-14691
> URL: https://issues.apache.org/jira/browse/SOLR-14691
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andrzej Bialecki
>Priority: Blocker
> Fix For: 8.7
>
>
> {{MetricUtils}} unnecessarily creates a lot of short-lived objects (maps and 
> lists). This affects GC, especially since metrics are frequently polled by 
> clients. We should refactor it to use {{MapWriter}} as much as possible.
> Alternatively we could provide our wrappers or subclasses of Codahale metrics 
> that implement {{MapWriter}}, then a lot of complexity in {{MetricUtils}} 
> wouldn't be needed at all.






[jira] [Updated] (SOLR-14691) Metrics reporting should avoid creating objects

2020-08-10 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14691:
--
Fix Version/s: 8.7

> Metrics reporting should avoid creating objects
> ---
>
> Key: SOLR-14691
> URL: https://issues.apache.org/jira/browse/SOLR-14691
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: metrics
>Reporter: Andrzej Bialecki
>Priority: Major
> Fix For: 8.7
>
>
> {{MetricUtils}} unnecessarily creates a lot of short-lived objects (maps and 
> lists). This affects GC, especially since metrics are frequently polled by 
> clients. We should refactor it to use {{MapWriter}} as much as possible.
> Alternatively we could provide our wrappers or subclasses of Codahale metrics 
> that implement {{MapWriter}}, then a lot of complexity in {{MetricUtils}} 
> wouldn't be needed at all.






[GitHub] [lucene-solr] noblepaul commented on pull request #1694: SOLR-14680: Provide simple interfaces to our cloud classes (only API)

2020-08-10 Thread GitBox


noblepaul commented on pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694#issuecomment-671305266


   I intend to merge this soon



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1694: SOLR-14680: Provide simple interfaces to our cloud classes (only API)

2020-08-10 Thread GitBox


noblepaul commented on a change in pull request #1694:
URL: https://github.com/apache/lucene-solr/pull/1694#discussion_r467845512



##
File path: solr/solrj/src/java/org/apache/solr/cluster/api/SolrNode.java
##
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.api;
+
+import org.apache.solr.common.util.SimpleMap;
+
+/** A read only view of a Solr node */
+public interface SolrNode {
+
+  /** The node name */
+  String name();
+
+  /**Base http url for this node
+   *
+   * @param isV2 if true gives the /api endpoint , else /solr endpoint
+   */
+  String baseUrl(boolean isV2);

Review comment:
   done











[GitHub] [lucene-solr] s1monw commented on pull request #1623: LUCENE-8962: Merge segments on getReader

2020-08-10 Thread GitBox


s1monw commented on pull request #1623:
URL: https://github.com/apache/lucene-solr/pull/1623#issuecomment-671301636


   @mikemccand I now understand why holding the _flushLock_ is illegal here. 
The problem is again the lock ordering in combination with the _commitLock_. 
One option we have is to remove the _flushLock_ altogether and replace its 
usage with the _commitLock_. I guess we need to find a better or new name for 
it, but I don't see how having two different locks buys us much, since they 
are both really just used to synchronize administration of the IW. I 
personally also don't see why it would buy us anything in terms of 
concurrency. WDYT
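The lock-ordering hazard being discussed can be illustrated with a 
self-contained sketch (hypothetical lock names, not IndexWriter's actual 
code): two threads that acquire the same pair of locks in opposite orders can 
deadlock, which is why collapsing two administrative locks into one can be the 
safer design.

```java
import java.util.concurrent.locks.ReentrantLock;

/**
 * Sketch of a lock-ordering rule. If every code path acquires
 * commitLock before flushLock, deadlock is impossible; a single path
 * that takes them in the opposite order reintroduces the hazard.
 */
public class LockOrdering {
  private final ReentrantLock commitLock = new ReentrantLock();
  private final ReentrantLock flushLock = new ReentrantLock();

  /** Safe path: commitLock first, then flushLock. */
  void commitThenFlush(Runnable work) {
    commitLock.lock();
    try {
      flushLock.lock();
      try {
        work.run();
      } finally {
        flushLock.unlock();
      }
    } finally {
      commitLock.unlock();
    }
  }

  /**
   * Hazardous path: opposite order. If thread A runs commitThenFlush
   * while thread B runs this method, each can end up holding one lock
   * and waiting forever for the other.
   */
  void flushThenCommit(Runnable work) {
    flushLock.lock();
    try {
      commitLock.lock();
      try {
        work.run();
      } finally {
        commitLock.unlock();
      }
    } finally {
      flushLock.unlock();
    }
  }

  public static void main(String[] args) {
    LockOrdering lo = new LockOrdering();
    // Single-threaded here, so both calls complete; the deadlock only
    // arises when the two methods run concurrently on different threads.
    lo.commitThenFlush(() -> System.out.println("commit->flush"));
    lo.flushThenCommit(() -> System.out.println("flush->commit"));
  }
}
```

Merging the two locks into one, as suggested, removes the ordering question 
entirely, at the cost of whatever concurrency the second lock might have 
allowed.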









[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174251#comment-17174251
 ] 

Cao Manh Dat commented on SOLR-14641:
-

bq. I disagree. In general, whoever wishes to introduce a change should own the 
performance testing, no matter who actually does it. Others can volunteer, but 
ultimate obligation should remain with the committer introducing the change.

I said that because I feel you did not even take a look at the commit; if you 
had, you would see that a perf run here is not necessary.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that 
> feature. It served its purpose well and should now be removed to reduce 
> complexity and save a request-response round trip.






[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-10 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174249#comment-17174249
 ] 

Erick Erickson commented on SOLR-14684:
---

[~caomanhdat] I was able to run 1,000 iterations with the patch overnight and 
got no failures, so this looks good!

> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If I beast this on my local machine, it fails (non reproducibly of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than return 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks are the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.






[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174244#comment-17174244
 ] 

Ishan Chattopadhyaya commented on SOLR-14641:
-

bq. But that quite non-sense to me from the point of who did the commit to do 
performance test
I disagree. In general, whoever wishes to introduce a change should own the 
performance testing, no matter who actually does it. Others can volunteer, but 
ultimate obligation should remain with the committer introducing the change.

bq. this change just basically remove deprecated code rather than optimization. 
If you are *confident* this is just dead code removal, please feel free to go 
ahead _with this one_. Thanks for the clarification!

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that 
> feature. It served its purpose well and should now be removed to reduce 
> complexity and save a request-response round trip.






[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174241#comment-17174241
 ] 

Cao Manh Dat commented on SOLR-14641:
-

But it makes little sense to me, from the point of view of whoever did the 
commit, to do a performance test for this one, since this change basically 
removes deprecated code rather than adding an optimization. Basically what we 
used to do here was
 * ask nodes whether they support versionRanges or not
 * if true (this is the default value since 7.0), go with versionRanges 
handling (instead of concrete versions).

The change made by this issue is
 * always go with versionRanges, since we know that all other nodes support 
it, so it is quite wasteful to ask first.

So if there is any performance regression, it happened a long time ago.

Anyway, I'm OK with reverting the change and letting your benchmark work 
finish if that makes things easier.

 

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that 
> feature. It served its purpose well and should now be removed to reduce 
> complexity and save a request-response round trip.






[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-10 Thread GitBox


sigram commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467810904



##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PropertyKeyFactory.java
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+/**
+ * Factory used by the plugin to create property keys to request property 
values from Solr.
+ *
+ * Building of a {@link PropertyKey} requires specifying the target (context) 
from which the value of that key should be
+ * obtained. This is done by specifying the appropriate {@link 
PropertyValueSource}.
+ * For clarity, when only a single type of target is acceptable, the 
corresponding subtype of {@link PropertyValueSource} is used instead
+ * (for example {@link Node}).
+ */
+public interface PropertyKeyFactory {
+  /**
+   * Returns a property key to request the number of cores on a {@link Node}.
+   */
+  PropertyKey createCoreCountKey(Node node);
+
+  /**
+   * Returns a property key to request disk related info on a {@link Node}.
+   */
+  PropertyKey createDiskInfoKey(Node node);
+
+  /**
+   * Returns a property key to request the value of a system property on a 
{@link Node}.
+   * @param systemPropertyName the name of the system property to retrieve.
+   */
+  PropertyKey createSystemPropertyKey(Node node, String systemPropertyName);
+
+  /**
+   * Returns a property key to request the value of a metric.
+   *
+   * Not all metrics make sense everywhere, but metrics can be applied to 
different objects. For example
+   * SEARCHER.searcher.indexCommitSize would make sense for a 
given replica of a given shard of a given collection,
+   * and possibly in other contexts.
+   *
+   * @param metricSource The registry of the metric. For example a specific 
{@link Replica}.
+   * @param metricName for example 
SEARCHER.searcher.indexCommitSize.
+   */
+  PropertyKey createMetricKey(PropertyValueSource metricSource, String 
metricName);

Review comment:
   One node usually hosts many replicas. Each of these replicas has a 
unique registry name, in the form of `solr.core.`, so we could 
build PropertyKey from Replica because all components of the full metrics name 
are known.
   
   This is not the case with `node`, `jvm` and `jetty` - I think we need to 
explicitly specify the registry name in these cases.
   
   (Edit: or implement a PropertyValueSource that is a facade for registry 
name, to keep the API here consistent)











[jira] [Comment Edited] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174239#comment-17174239
 ] 

Ishan Chattopadhyaya edited comment on SOLR-14641 at 8/10/20, 10:41 AM:


bq. It doesn't make sense to asking everyone do a dedicated performance test 
before and after their commits.
I am not requesting performance tests for every commit. But for those that 
affect the default code path for all/most users. PeerSync is enabled by 
default, as an example.

bq. I believe the right way to ensure performance is coming up with something 
like lucene bench, so every downgrade and upgrade will be recorded and can be 
watched (per multiple commits).
I totally agree, and that is where I'm going with 
https://github.com/thesearchstack/solr-bench. However, in the absence of that, 
there is absolutely no reason why we shouldn't perform performance testing 
manually before subjecting our users to the changes. I don't want us to repeat 
what happened with SOLR-14665 (where the commit happened without any 
performance testing, the issue was released and regression was caught only 
after the release. And what is worse is that a bugfix release has still not 
happened for that).


was (Author: ichattopadhyaya):
bq. It doesn't make sense to asking everyone do a dedicated performance test 
before and after their commits.
I am not requesting performance tests for every commit. But for those that 
affect the default code path for all/most users. PeerSync is enabled by 
default, as an example.

bq. I believe the right way to ensure performance is coming up with something 
like lucene bench, so every downgrade and upgrade will be recorded and can be 
watched (per multiple commits).
I totally agree, and that is where I'm going with 
https://github.com/thesearchstack/solr-bench. However, in the absence of that, 
there is absolutely no reason why we shouldn't perform performance testing 
manually before subjecting our users to the changes. I don't want us to repeat 
what happened with SOLR-14665.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that 
> feature. It served its purpose well and should now be removed to reduce 
> complexity and save a request-response round trip.






[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174239#comment-17174239
 ] 

Ishan Chattopadhyaya commented on SOLR-14641:
-

bq. It doesn't make sense to asking everyone do a dedicated performance test 
before and after their commits.
I am not requesting performance tests for every commit. But for those that 
affect the default code path for all/most users. PeerSync is enabled by 
default, as an example.

bq. I believe the right way to ensure performance is coming up with something 
like lucene bench, so every downgrade and upgrade will be recorded and can be 
watched (per multiple commits).
I totally agree, and that is where I'm going with 
https://github.com/thesearchstack/solr-bench. However, in the absence of that, 
there is absolutely no reason why we shouldn't perform performance testing 
manually before subjecting our users to the changes. I don't want us to repeat 
what happened with SOLR-14665.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that 
> feature. It served its purpose well and should now be removed to reduce 
> complexity and save a request-response round trip.






[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174236#comment-17174236
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Ok then I will try my best to run it.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request will submit n requests (n 
> equals the number of shards) to an executor. So each request will correspond 
> to a thread; after sending a request, that thread basically does nothing but 
> wait for the response from the other side. That thread will be swapped out and 
> the CPU will try to handle another thread (this is called a context switch: 
> the CPU saves the context of the current thread and switches to another one). 
> When some data (not all) comes back, that thread will be called to parse the 
> data, then it will wait until more data comes back. So there will be a lot of 
> context switching in the CPU, which is quite an inefficient use of threads. 
> Basically we want fewer threads, and most of them must be busy all the time, 
> because threads are not free, and neither is context switching. That is the 
> main idea behind abstractions like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One main thing 
> to notice here is that all listeners should finish quickly; if a listener 
> blocks, no further data for that request will be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners; one of them is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content to the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> Using this, the model of HttpShardHandler can be rewritten into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take a long 
> time, since sending a request is a very quick operation. With this approach, 
> handling threads won’t be spun up until the first bytes are sent back. 
> Notice that in this approach we still have active threads waiting for more 
> data from the InputStream.
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty has another listener called BufferingResponseListener. This is how it 
> is used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   //handling response
> }
>   }
> 
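The non-blocking send-plus-callbacks model quoted above (send returns immediately; listeners run when data arrives) can be sketched with JDK-only primitives. This is an illustrative sketch, not Jetty's actual API: a `CompletableFuture` stands in for the async response, and all class and method names below are assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncSendSketch {
  // Simulated async client: "sends" a request and completes the future
  // from a worker thread, so the calling thread never blocks.
  static CompletableFuture<String> send(ExecutorService pool, String req) {
    CompletableFuture<String> response = new CompletableFuture<>();
    pool.submit(() -> response.complete("response-to-" + req));
    return response;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    StringBuilder log = new StringBuilder();
    // Like an onComplete listener: this callback runs when the response
    // arrives, not on the thread that called send().
    CompletableFuture<Void> done = send(pool, "shard1").thenAccept(log::append);
    done.get(5, TimeUnit.SECONDS);
    pool.shutdown();
    if (!log.toString().equals("response-to-shard1")) throw new AssertionError(log);
    System.out.println(log);
  }
}
```

The point of the sketch is the thread accounting: the caller is free as soon as `send` returns, and only the short-lived callback consumes a thread when data shows up.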

[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174234#comment-17174234
 ] 

Ishan Chattopadhyaya commented on SOLR-14354:
-

bq. Ishan Chattopadhyaya, fair enough, do you want to do the benchmark?
Sorry :-( I can work on setting up some automated benchmarking (basically, 
automated runs of https://github.com/thesearchstack/solr-bench), but I won't be 
able to finish this soon enough before 8.7 due to client priorities. As of now, 
I'm actively and aggressively working on a similar issue on a higher priority, 
SOLR-13933, and will set up both of them together on a public server once this 
is done.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request will submit n requests (n 
> equals the number of shards) to an executor. So each request will correspond 
> to a thread; after sending a request, that thread basically does nothing but 
> wait for the response from the other side. That thread will be swapped out and 
> the CPU will try to handle another thread (this is called a context switch: 
> the CPU saves the context of the current thread and switches to another one). 
> When some data (not all) comes back, that thread will be called to parse the 
> data, then it will wait until more data comes back. So there will be a lot of 
> context switching in the CPU, which is quite an inefficient use of threads. 
> Basically we want fewer threads, and most of them must be busy all the time, 
> because threads are not free, and neither is context switching. That is the 
> main idea behind abstractions like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One main thing 
> to notice here is that all listeners should finish quickly; if a listener 
> blocks, no further data for that request will be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners; one of them is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content to the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> Using this, the model of HttpShardHandler can be rewritten into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take a long 
> time, since sending a request is a very quick operation. With this approach, 
> handling threads won’t be spun up until the first bytes are sent back. 
> Notice that in this 
> 
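Solution 1's two-thread model described above (a short-lived feeder thread pushing each byte[] chunk into an InputStream that a reader consumes, blocking when it cannot feed) can be sketched with JDK pipes. This is an assumption-laden illustration, not Solr's or Jetty's code; all names are invented for the example.

```java
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

public class PipeSketch {
  static String transfer() throws Exception {
    PipedOutputStream feed = new PipedOutputStream();
    PipedInputStream is = new PipedInputStream(feed);
    // Feeder: plays the role of the short-lived task that pushes each
    // byte[] chunk into the stream as it arrives; it blocks if the pipe fills.
    Thread feeder = new Thread(() -> {
      try (feed) {
        feed.write("chunk1|".getBytes(StandardCharsets.UTF_8));
        feed.write("chunk2".getBytes(StandardCharsets.UTF_8));
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    });
    feeder.start();
    // Reader: consumes the response content from the InputStream,
    // blocking until the feeder closes its end of the pipe.
    byte[] body = is.readAllBytes();
    feeder.join();
    return new String(body, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws Exception {
    if (!transfer().equals("chunk1|chunk2")) throw new AssertionError();
    System.out.println("ok");
  }
}
```

As in the quoted description, the reader thread stays alive waiting on the stream, which is exactly the cost Solution 2 tries to avoid.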

[jira] [Comment Edited] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174228#comment-17174228
 ] 

Cao Manh Dat edited comment on SOLR-14641 at 8/10/20, 10:29 AM:


I believe the right way to ensure performance is to come up with something like 
lucene bench, so every downgrade and upgrade will be recorded and can be 
watched (per multiple commits). It doesn't make sense to ask everyone to do a 
dedicated performance test before and after their commits.


was (Author: caomanhdat):
I believe the right way to ensure performance is to come up with something like 
lucene bench, so every downgrade and upgrade will be recorded and can be 
watched. It doesn't make sense to ask everyone to do a dedicated performance 
test before and after their commits.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that feature 
> or not. It served its purpose well, and it should be removed to reduce 
> complexity and save a request-response trip for asking that.






[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174228#comment-17174228
 ] 

Cao Manh Dat commented on SOLR-14641:
-

I believe the right way to ensure performance is to come up with something like 
lucene bench, so every downgrade and upgrade will be recorded and can be 
watched. It doesn't make sense to ask everyone to do a dedicated performance 
test before and after their commits.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that feature 
> or not. It served its purpose well, and it should be removed to reduce 
> complexity and save a request-response trip for asking that.






[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174226#comment-17174226
 ] 

Cao Manh Dat commented on SOLR-14641:
-

I kinda hesitate to do such performance testing for this one; what is the 
reason behind that? This issue simply removes a code path that is no longer used.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that feature 
> or not. It served its purpose well, and it should be removed to reduce 
> complexity and save a request-response trip for asking that.






[GitHub] [lucene-solr] sigram commented on a change in pull request #1684: SOLR-14613: strongly typed initial proposal for placement plugin interface

2020-08-10 Thread GitBox


sigram commented on a change in pull request #1684:
URL: https://github.com/apache/lucene-solr/pull/1684#discussion_r467801413



##
File path: solr/core/src/java/org/apache/solr/cluster/placement/Cluster.java
##
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.io.IOException;
+import java.util.Optional;
+import java.util.Set;
+
+/**
+ * A representation of the (initial) cluster state, providing information 
on which nodes are part of the cluster and a way
+ * to get to more detailed info.
+ *
+ * This instance can also be used as a {@link PropertyValueSource} if 
{@link PropertyKey}'s need to be specified with
+ * a global cluster target.
+ */
+public interface Cluster extends PropertyValueSource {
+  /**
+   * @return current set of live nodes. Never null, never empty 
(Solr wouldn't call the plugin if empty
+   * since no useful work could then be done).
+   */
+  Set<Node> getLiveNodes();
+
+  /**
+   * Returns info about the given collection if one exists. Because it is 
not expected for plugins to request info about
+   * a large number of collections, requests can only be made one by one.
+   *
+   * This is also the reason we do not return a {@link java.util.Map} or 
{@link Set} of {@link SolrCollection}'s here: it would be
+   * wasteful to fetch all data and fill such a map when plugin code likely 
needs info about at most one or two collections.
+   */
+  Optional<SolrCollection> getCollection(String collectionName) throws 
IOException;
+
+  /**
+   * Allows getting all {@link SolrCollection} present in the cluster.
+   *
+   * WARNING: this call might be extremely inefficient on large 
clusters. Usage is discouraged.
+   */
+  Set<SolrCollection> getAllCollections();

Review comment:
   I meant just the list of names ... `Collection<String>`, otherwise I 
agree it can be very inefficient.
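The API shape being discussed here, looking up one collection at a time and getting an Optional back instead of materializing a map of everything, can be sketched with plain JDK types. The names below are illustrative only, not the proposed Solr interface.

```java
import java.util.Map;
import java.util.Optional;

public class LookupSketch {
  interface Cluster {
    // One-at-a-time lookup: fetch info for a single collection on demand
    // instead of eagerly building a Set/Map of every collection in the cluster.
    Optional<String> getCollection(String name);
  }

  public static void main(String[] args) {
    Map<String, String> state = Map.of("products", "2 shards");
    Cluster cluster = name -> Optional.ofNullable(state.get(name));
    if (!"2 shards".equals(cluster.getCollection("products").orElseThrow()))
      throw new AssertionError();
    // A missing collection yields an empty Optional, not a null or an error.
    if (cluster.getCollection("missing").isPresent())
      throw new AssertionError();
    System.out.println("ok");
  }
}
```

The Optional return makes "collection does not exist" an explicit, type-checked case, which is the argument for preferring it over a bulk map when plugins usually need only one or two collections.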

##
File path: 
solr/core/src/java/org/apache/solr/cluster/placement/PlacementPlanFactory.java
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cluster.placement;
+
+import java.util.Set;
+
+/**
+ * Allows plugins to create {@link PlacementPlan}s telling the Solr layer 
where to create replicas following the processing of
+ * a {@link PlacementRequest}. The Solr layer can (and will) check that the 
{@link PlacementPlan} conforms to the {@link PlacementRequest} (and
+ * if it does not, the requested operation will fail).
+ */
+public interface PlacementPlanFactory {
+  /**
+   * Creates a {@link PlacementPlan} for adding a new collection and its 
replicas.
+   *
+   * This is in support of {@link 
org.apache.solr.cloud.api.collections.CreateCollectionCmd}.
+   */
+  PlacementPlan 
createPlacementPlanNewCollection(CreateNewCollectionPlacementRequest request, 
String CollectionName, Set<ReplicaPlacement> replicaPlacements);
+
+  /**
+   * Creates a {@link PlacementPlan} for adding replicas to a given shard 
of an existing collection.
+   *
+   * This is in support (directly or indirectly) of {@link 
org.apache.solr.cloud.api.collections.AddReplicaCmd},
+   * {@link org.apache.solr.cloud.api.collections.CreateShardCmd}, {@link 
org.apache.solr.cloud.api.collections.ReplaceNodeCmd},
+   * {@link org.apache.solr.cloud.api.collections.MoveReplicaCmd}, {@link 
org.apache.solr.cloud.api.collections.SplitShardCmd},
+   * {@link org.apache.solr.cloud.api.collections.RestoreCmd} and {@link 

[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174225#comment-17174225
 ] 

Ishan Chattopadhyaya commented on SOLR-14641:
-

Since this is a change that affects all users by default, I would still prefer 
that we have performance testing numbers to make sure there is no performance 
regression.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that feature 
> or not. It served its purpose well, and it should be removed to reduce 
> complexity and save a request-response trip for asking that.






[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174224#comment-17174224
 ] 

Cao Manh Dat commented on SOLR-14354:
-

[~ichattopadhyaya], fair enough, do you want to do the benchmark?

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request will submit n requests (n 
> equals the number of shards) to an executor. So each request will correspond 
> to a thread; after sending a request, that thread basically does nothing but 
> wait for the response from the other side. That thread will be swapped out and 
> the CPU will try to handle another thread (this is called a context switch: 
> the CPU saves the context of the current thread and switches to another one). 
> When some data (not all) comes back, that thread will be called to parse the 
> data, then it will wait until more data comes back. So there will be a lot of 
> context switching in the CPU, which is quite an inefficient use of threads. 
> Basically we want fewer threads, and most of them must be busy all the time, 
> because threads are not free, and neither is context switching. That is the 
> main idea behind abstractions like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One main thing 
> to notice here is that all listeners should finish quickly; if a listener 
> blocks, no further data for that request will be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners; one of them is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content to the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> Using this, the model of HttpShardHandler can be rewritten into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take a long 
> time, since sending a request is a very quick operation. With this approach, 
> handling threads won’t be spun up until the first bytes are sent back. 
> Notice that in this approach we still have active threads waiting for more 
> data from the InputStream.
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty has another listener called BufferingResponseListener. This is how it 
> is used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   

[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174220#comment-17174220
 ] 

Cao Manh Dat commented on SOLR-14641:
-

[~ichattopadhyaya] I don't think this will be a noticeable boost in time, since 
this request is very lightweight.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 
> and 7.0. To maintain backward compatibility at the time, we introduced an 
> endpoint in RealTimeGetComponent to check whether a node supports that feature 
> or not. It served its purpose well, and it should be removed to reduce 
> complexity and save a request-response trip for asking that.






[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174221#comment-17174221
 ] 

Ishan Chattopadhyaya commented on SOLR-14354:
-

For a change like this, I would like to see performance numbers. Unless we have 
them, I am not comfortable releasing with this feature.
If you would like to use https://github.com/thesearchstack/solr-bench, I can 
offer help and assistance.
In the absence of performance numbers, I shall be inclined to request a revert 
of this change (veto).

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request will submit n requests (n 
> equals the number of shards) to an executor. So each request will correspond 
> to a thread; after sending a request, that thread basically does nothing but 
> wait for the response from the other side. That thread will be swapped out and 
> the CPU will try to handle another thread (this is called a context switch: 
> the CPU saves the context of the current thread and switches to another one). 
> When some data (not all) comes back, that thread will be called to parse the 
> data, then it will wait until more data comes back. So there will be a lot of 
> context switching in the CPU, which is quite an inefficient use of threads. 
> Basically we want fewer threads, and most of them must be busy all the time, 
> because threads are not free, and neither is context switching. That is the 
> main idea behind abstractions like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One main thing 
> to notice here is that all listeners should finish quickly; if a listener 
> blocks, no further data for that request will be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners; one of them is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content to the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> Using this, the model of HttpShardHandler can be rewritten into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take a long 
> time, since sending a request is a very quick operation. With this approach, 
> handling threads won’t be spun up until the first bytes are sent back. 
> Notice that in this approach we still have active threads waiting for more 
> data from the InputStream.
> h2. 4. Solution 2: Buffering data and handle it inside 

[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174214#comment-17174214
 ] 

Ishan Chattopadhyaya commented on SOLR-14641:
-

bq.  it should be removed to [...] a request-response trip for asking that.
[~caomanhdat], based on your comment, it seems this is also a performance 
optimization. What is the level of performance testing/benchmarking that has 
been done for this issue?



> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] noblepaul commented on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-10 Thread GitBox


noblepaul commented on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671263417


   I realized that I had to tweak the APIs after writing an implementation. I 
have updated the API-only PR #1694 to reflect the latest




This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-6152) Pre-populating values into search parameters on the query page of solr admin

2020-08-10 Thread Jakob Furrer (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174211#comment-17174211
 ] 

Jakob Furrer commented on SOLR-6152:


New Patch:
* Regression regarding the raw value field has been fixed.
* The value in the 'qt' field is correctly applied to the browser address bar 
URL and the REST response URL.
   Note: When the 'qt' value starts with the '/' character, it is used in the 
resulting REST response URL path (i.e. in front of the question mark).
   Otherwise, the value is appended like any other parameter (i.e. as 
"&qt=qt_value").
   This is consistent with prior behavior.
* The 'indent off' checkbox is persisted.
   Note that the checkbox displays the *inverse* value of the 'indent' 
parameter, i.e. "indent=false" ticks the checkbox.
* Regression fixed: When "Basic authentication plugin" was used (configured in 
security.json), an endless loop of redirects between login-page and query-page 
occurred.
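
The 'qt' handling described above can be sketched as follows. This is a hypothetical standalone helper for illustration only, not the actual Admin UI code (which is JavaScript); the method and class names are assumptions.

```java
public class QtUrlSketch {

    // Mirrors the described behavior: a '/'-prefixed qt value becomes part of
    // the request path (in front of the question mark); any other non-empty
    // value is appended as an ordinary "qt" parameter.
    static String buildQueryUrl(String base, String qt, String params) {
        if (qt != null && qt.startsWith("/")) {
            return base + qt + "?" + params;
        }
        String suffix = (qt == null || qt.isEmpty()) ? "" : "&qt=" + qt;
        return base + "/select?" + params + suffix;
    }

    public static void main(String[] args) {
        // prints "/solr/techproducts/query?q=*:*"
        System.out.println(buildQueryUrl("/solr/techproducts", "/query", "q=*:*"));
        // prints "/solr/techproducts/select?q=*:*&qt=dismax"
        System.out.println(buildQueryUrl("/solr/techproducts", "dismax", "q=*:*"));
    }
}
```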

> Pre-populating values into search parameters on the query page of solr admin
> 
>
> Key: SOLR-6152
> URL: https://issues.apache.org/jira/browse/SOLR-6152
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI
>Affects Versions: 4.3.1
>Reporter: Dmitry Kan
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: SOLR-6152.patch, SOLR-6152.patch, SOLR-6152.patch, 
> SOLR-6152.patch, copy_url_to_clipboard.png, copy_url_to_clipboard_v2.png, 
> prefilling_and_extending_the_multivalue_parameter_fq.png, 
> prepoluate_query_parameters_query_page.bmp
>
>
> In some use cases, it is highly desirable to be able to pre-populate the 
> query page of solr admin with specific values.
> In particular use case of mine, the solr admin user must pass a date range 
> value without which the query would fail.
> It isn't easy to remember the value format for non-solr experts, so I would 
> like to have a way of hooking that value "example" into the query page.
> See the screenshot attached, where I have inserted the fq parameter with date 
> range into the Raw Query Parameters.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-6152) Pre-populating values into search parameters on the query page of solr admin

2020-08-10 Thread Jakob Furrer (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Furrer updated SOLR-6152:
---
Attachment: SOLR-6152.patch

> Pre-populating values into search parameters on the query page of solr admin
> 
>
> Key: SOLR-6152
> URL: https://issues.apache.org/jira/browse/SOLR-6152
> Project: Solr
>  Issue Type: Improvement
>  Components: Admin UI
>Affects Versions: 4.3.1
>Reporter: Dmitry Kan
>Assignee: Jan Høydahl
>Priority: Major
> Attachments: SOLR-6152.patch, SOLR-6152.patch, SOLR-6152.patch, 
> SOLR-6152.patch, copy_url_to_clipboard.png, copy_url_to_clipboard_v2.png, 
> prefilling_and_extending_the_multivalue_parameter_fq.png, 
> prepoluate_query_parameters_query_page.bmp
>
>
> In some use cases, it is highly desirable to be able to pre-populate the 
> query page of solr admin with specific values.
> In particular use case of mine, the solr admin user must pass a date range 
> value without which the query would fail.
> It isn't easy to remember the value format for non-solr experts, so I would 
> like to have a way of hooking that value "example" into the query page.
> See the screenshot attached, where I have inserted the fq parameter with date 
> range into the Raw Query Parameters.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module

2020-08-10 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174199#comment-17174199
 ] 

Tomoko Uchida commented on LUCENE-9448:
---

{quote}I always thought Luke is still a "stand-alone" tool so I suggested 
dependency assembly for a stand-alone tool
{quote}
I'm not sure if it is related... when Luke was integrated into Lucene, my very 
first suggestion was creating a stand-alone Luke app (zip/tar) that is 
separately distributed from Lucene, just like Solr; it was rejected and I did 
not argue about it. I just remembered that.

> Make an equivalent to Ant's "run" target for Luke module
> 
>
> Key: LUCENE-9448
> URL: https://issues.apache.org/jira/browse/LUCENE-9448
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-9448.patch
>
>
> With Ant build, Luke Swing app can be launched by "ant run" after checking 
> out the source code. "ant run" allows developers to immediately see the 
> effects of UI changes without creating the whole zip/tgz package (originally, 
> it was suggested when integrating Luke to Lucene).
> In Gradle, {{:lucene:luke:run}} task would be easily implemented with 
> {{JavaExec}}, I think.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide an implementation for the new SolrCluster API

2020-08-10 Thread GitBox


noblepaul edited a comment on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671229892


   > Separating out the new lazy implementations into another PR and keeping 
this one for adding interfaces to internal classes would have made reviewing 
easier.
   
   Yeah, that was the other PR #1694 , I have revived it
   
   @murblanc please review it 
   
   >Are there places in the code where currently the concrete classes are used 
and that could be changed to use the interfaces instead? In other words, 
how/where would these interfaces be used?
   
   The current concrete classes do not use/implement these interfaces. These 
interfaces will only be a part of implementations. for instance, the 
`LazySolrCluster` is one of the impl. In the future we should add a couple more
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] noblepaul edited a comment on pull request #1730: SOLR-14680: Provide simple interfaces to our concrete SolrCloud classes

2020-08-10 Thread GitBox


noblepaul edited a comment on pull request #1730:
URL: https://github.com/apache/lucene-solr/pull/1730#issuecomment-671229892


   > Separating out the new lazy implementations into another PR and keeping 
this one for adding interfaces to internal classes would have made reviewing 
easier.
   
   Yeah, that was the other PR #1694 , I have revived it
   
   >Are there places in the code where currently the concrete classes are used 
and that could be changed to use the interfaces instead? In other words, 
how/where would these interfaces be used?
   
   The current concrete classes do not use/implement these interfaces. These 
interfaces will only be a part of implementations. for instance, the 
`LazySolrCluster` is one of the impl. In the future we should add a couple more
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org


