Location of Solr 9 Branch

2021-03-02 Thread Phill Campbell
I have just begun investigating Solr source code. Where is the branch for Solr 
9?




Re: NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-03-01 Thread Phill Campbell
Anyone?

> On Feb 24, 2021, at 7:47 AM, Phill Campbell  
> wrote:
> 
> Last week I switched to Solr 8.7 from a “special” build of Solr 6.6
> 
> The system has a timeout set for querying. I am now seeing this bug.
> 
> https://issues.apache.org/jira/browse/SOLR-14758 
> <https://issues.apache.org/jira/browse/SOLR-14758>
> 
> Max Query Time goes from 1.6 seconds to 20 seconds and affects the entire 
> system for about 2 minutes as reported in New Relic.
> 
> null:java.lang.NullPointerException
>   at 
> org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:935)
>   at 
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)
>   at 
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:486)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627)
> 
> 
> Can this be fixed in a patch for Solr 8.8? I do not want to have to go back 
> to Solr 6 and reindex the system; that takes 2 days using 180 EMR instances.
> 
> Please advise. Thank you.



NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-02-24 Thread Phill Campbell
Last week I switched to Solr 8.7 from a “special” build of Solr 6.6

The system has a timeout set for querying. I am now seeing this bug.

https://issues.apache.org/jira/browse/SOLR-14758 


Max Query Time goes from 1.6 seconds to 20 seconds and affects the entire 
system for about 2 minutes as reported in New Relic.

null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:935)
at 
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)
at 
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:486)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627)


Can this be fixed in a patch for Solr 8.8? I do not want to have to go back to 
Solr 6 and reindex the system; that takes 2 days using 180 EMR instances.

Please advise. Thank you.
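As a reference for the timeAllowed setting discussed above, a minimal SolrJ sketch of how the limit is set on a query and how a truncated response can be detected afterwards; the ZooKeeper address, collection name, and sort field are placeholders, not values taken from this thread:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

import java.util.Collections;
import java.util.Optional;

public class TimeAllowedExample {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper address and collection name.
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            client.setDefaultCollection("myCollection");

            SolrQuery query = new SolrQuery("*:*");
            query.setTimeAllowed(8000);                          // stop after 8 seconds
            query.setSort("timestamp_l", SolrQuery.ORDER.desc);  // placeholder sort field

            QueryResponse rsp = client.query(query);

            // When the limit is hit, Solr flags the response header with partialResults.
            Object partial = rsp.getResponseHeader().get("partialResults");
            if (Boolean.TRUE.equals(partial)) {
                System.out.println("Partial results only: " + rsp.getResults().getNumFound());
            }
        }
    }
}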


Re: leader election stuck after hosts restarts

2021-01-12 Thread Phill Campbell
Which version of Apache Solr?

> On Jan 12, 2021, at 8:36 AM, Pierre Salagnac  
> wrote:
> 
> Hello,
> We had a stuck leader election for a shard.
> 
> We have collections with 2 shards, each shard has 5 replicas. We have many
> collections but the issue happened for a single shard. Once all host
> restarts completed, this shard was stuck with one replica in "recovery"
> state and all others in "down" state.
> 
> Here is the state of the shard returned by CLUSTERSTATUS command.
>  "replicas":{
>"core_node3":{
>  "core":"_shard1_replica_n1",
>  "base_url":"https://host1:8983/solr;,
>  "node_name":"host1:8983_solr",
>  "state":"recovering",
>  "type":"NRT",
>  "force_set_state":"false"},
>"core_node9":{
>  "core":"_shard1_replica_n6",
>  "base_url":"https://host2:8983/solr;,
>  "node_name":"host2:8983_solr",
>  "state":"down",
>  "type":"NRT",
>  "force_set_state":"false"},
>"core_node26":{
>  "core":"_shard1_replica_n25",
>  "base_url":"https://host3:8983/solr;,
>  "node_name":"host3:8983_solr",
>  "state":"down",
>  "type":"NRT",
>  "force_set_state":"false"},
>"core_node28":{
>  "core":"_shard1_replica_n27",
>  "base_url":"https://host4:8983/solr;,
>  "node_name":"host4:8983_solr",
>  "state":"down",
>  "type":"NRT",
>  "force_set_state":"false"},
>"core_node34":{
>  "core":"_shard1_replica_n33",
>  "base_url":"https://host5:8983/solr;,
>  "node_name":"host5:8983_solr",
>  "state":"down",
>  "type":"NRT",
>  "force_set_state":"false"}}}
> 
> The workaround was to shut down server host1, the one with the replica stuck in
> recovery state. This unblocked the leader election and the 4 other replicas went active.
> 
> Here is the first error I found in logs related to this shard. It happened
> while shutting down server host3, which was the leader at that time.
> (updateExecutor-5-thread-33908-processing-x:..._shard1_replica_n25
> r:core_node26 null n:... s:shard1) [c:... s:shard1 r:core_node26
> x:..._shard1_replica_n25] o.a.s.c.s.i.ConcurrentUpdateHttp2SolrClient Error
> consuming and closing http response stream. =>
> java.nio.channels.AsynchronousCloseException
> at
> org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read(InputStreamResponseListener.java:316)
> java.nio.channels.AsynchronousCloseException: null
> at
> org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read(InputStreamResponseListener.java:316)
> at java.io.InputStream.read(InputStream.java:205) ~[?:?]
> at
> org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read(InputStreamResponseListener.java:287)
> at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient$Runner.sendUpdateStream(ConcurrentUpdateHttp2SolrClient.java:283)
> at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient$Runner.run(ConcurrentUpdateHttp2SolrClient.java:176)
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> ~[?:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> ~[?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> 
> My understanding is that following this error, each server restart ended with the
> replica on that server being in "down" state, but I'm not sure how to
> confirm that.
> We then entered a loop where the term is increased because of failed
> replication.
> 
> Is this a known issue? I found no similar ticket in Jira.
> Could you please help us get a better understanding of the issue?
> Thanks
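For anyone who wants to script the same check that the CLUSTERSTATUS output above shows, a rough SolrJ sketch that reads shard leaders and replica states from ZooKeeper; the ZooKeeper address and collection name are placeholders:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.cloud.DocCollection;
import org.apache.solr.common.cloud.Replica;
import org.apache.solr.common.cloud.Slice;

import java.util.Collections;
import java.util.Optional;

public class ReplicaStateCheck {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            client.connect();
            DocCollection coll = client.getZkStateReader()
                    .getClusterState().getCollection("myCollection");
            for (Slice slice : coll.getSlices()) {
                Replica leader = slice.getLeader();   // null while no leader is elected
                System.out.println(slice.getName() + " leader: "
                        + (leader == null ? "NONE" : leader.getNodeName()));
                for (Replica replica : slice.getReplicas()) {
                    System.out.println("  " + replica.getName() + " on "
                            + replica.getNodeName() + " -> " + replica.getState());
                }
            }
        }
    }
}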



Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-19 Thread Phill Campbell
The entire idea of removing a word from our language is problematic.
A lot of history books that detail the terrible conditions of peoples over 
recorded history would have to be changed, or removed.

I find the “F” word extremely offensive. I find references to Deity while 
cursing extremely offensive. It is my privilege to deal with offense for the 
sake of liberty.

The use of the word is not promoting the practice nor is it denigrating those 
that have that in their history.

The “world” has decided, whatever.

Delegator - Handler

A common pattern we are all aware of. Pretty simple.



> On Jun 19, 2020, at 8:21 AM, Ilan Ginzburg  wrote:
> 
> +1 to Jan's "clustered" vs "non clustered".
> 
> If we clean up terminology, I suggest we also clarify the meaning and use
> of Slice vs Shard vs Leader vs Replica vs Core. Here's my understanding:
> 
> I consider Slice == Shard (and would happily drop Slice): a logical concept
> of a specific subset of a collection.
> A Shard then has one or multiple copies of the data called Replicas (if a
> shard has no copy of the data there's an issue). The Leader is one such
> Replica. A shard with a replication factor of 1 has a single Replica that
> happens to be the Leader. "Replica" does therefore not imply "replication".
> A Core is an in memory instantiation of a disk index representing a
> Replica. I believe that often the on disk index is referred to as "Core" as
> well (I'm not bothered by this, there's no associated confusion IMO).
> 
> Overseer is a central place where a fair bit of the cluster management
> logic is implemented today (Collection API, Autoscaling, Cluster state
> change). It is therefore a cluster manager. Note that a different
> implementation of "Clustered Solr" (a.k.a. SolrCloud) can most likely be
> done without the need of a central process in addition to the already
> centralized storage backend (currently ZooKeeper). In other words, Overseer
> is not IMO the defining characteristic of SolrCloud, it is one
> implementation choice, and there are others. To keep in mind for clarity
> and to guide renaming.
> 
> On Fri, Jun 19, 2020 at 3:23 PM j.s.  wrote:
> 
>> hi
>> 
>> solr is very helpful.
>> 
>> On 6/18/20 9:50 PM, Rahul Goswami wrote:
>>> So +1 on "slave" being the problematic term IMO, not "master".
>> 
>> but you cannot have a master without a slave, n'est-ce pas?
>> 
>> i think it is better to use the metaphor of copying rather than one of
>> hierarchy. language has so many (unintended) consequences ...
>> 
>> good luck!
>> 



Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-18 Thread Phill Campbell
Master - Worker
Master - Peon
Master - Helper
Master - Servant

The term that is not wanted is “slave”. The term “master” is not a problem IMO.

> On Jun 18, 2020, at 3:59 PM, Jan Høydahl  wrote:
> 
> I support Mike Drob and Trey Grainger. We should re-use the leader/replica
> terminology from Cloud. Even if you hand-configure a master/slave cluster
> and orchestrate what doc goes to which node/shard, and hand-code your shards
> parameter, you will still have a cluster where you’d send updates to the 
> leader of 
> each shard and the replicas would replicate the index from the leader.
> 
> Let’s instead find a new good name for the cluster type. Standalone kind of 
> works
> for me, but I see it can be confused with single-node. We have also discussed
> replacing SolrCloud (which is a terrible name) with something more 
> descriptive.
> 
> Today: SolrCloud vs Master/slave
> Alt A: SolrCloud vs Standalone
> Alt B: SolrCloud vs Legacy
> Alt C: Clustered vs Independent
> Alt D: Clustered vs Manual mode
> 
> Jan
> 
>> 18. jun. 2020 kl. 15:53 skrev Mike Drob :
>> 
>> I personally think that using Solr cloud terminology for this would be fine
>> with leader/follower. The leader is the one that accepts updates, followers
>> cascade the updates somehow. The presence of ZK or election doesn’t really
>> change this detail.
>> 
>> However, if folks feel that it’s confusing, then I can’t tell them that
>> they’re not confused. Especially when they’re working with others who have
>> less Solr experience than we do and are less familiar with the intricacies.
>> 
>> Primary/Replica seems acceptable. Coordinator instead of Overseer seems
>> acceptable.
>> 
>> Would love to see this in 9.0!
>> 
>> Mike
>> 
>> On Thu, Jun 18, 2020 at 8:25 AM John Gallagher
>>  wrote:
>> 
>>> While on the topic of renaming roles, I'd like to propose finding a better
>>> term than "overseer" which has historical slavery connotations as well.
>>> Director, perhaps?
>>> 
>>> 
>>> John Gallagher
>>> 
>>> On Thu, Jun 18, 2020 at 8:48 AM Jason Gerlowski 
>>> wrote:
>>> 
 +1 to rename master/slave, and +1 to choosing terminology distinct
 from what's used for SolrCloud.  I could be happy with several of the
 proposed options.  Since a good few have been proposed though, maybe
 an eventual vote thread is the most organized way to aggregate the
 opinions here.
 
 I'm less positive about the prospect of changing the name of our
 primary git branch.  Most projects that contributors might come from,
 most tutorials out there to learn git, most tools built on top of git
 - the majority are going to assume "master" as the main branch.  I
 appreciate the change that Github is trying to effect in changing the
 default for new projects, but it'll be a long time before that
 competes with the huge bulk of projects, documentation, etc. out there
 using "master".  Our contributors are smart and I'm sure they'd figure
 it out if we used "main" or something else instead, but having a
 non-standard git setup would be one more "papercut" in understanding
 how to contribute to a project that already makes that harder than it
 should.
 
 Jason
 
 
 On Thu, Jun 18, 2020 at 7:33 AM Demian Katz 
 wrote:
> 
> Regarding people having a problem with the word "master" -- GitHub is
 changing the default branch name away from "master," even in isolation
>>> from
 a "slave" pairing... so the terminology seems to be falling out of favor
>>> in
 all contexts. See:
> 
> 
 
>>> https://www.cnet.com/news/microsofts-github-is-removing-coding-terms-like-master-and-slave/
> 
> I'm not here to start a debate about the semantics of that, just to
 provide evidence that in some communities, the term "master" is causing
 concern all by itself. If we're going to make the change anyway, it might
 be best to get it over with and pick the most appropriate terminology we
 can agree upon, rather than trying to minimize the amount of change. It's
 going to be backward breaking anyway, so we might as well do it all now
 rather than risk having to go through two separate breaking changes at
 different points in time.
> 
> - Demian
> 
> -Original Message-
> From: Noble Paul 
> Sent: Thursday, June 18, 2020 1:51 AM
> To: solr-user@lucene.apache.org
> Subject: [EXTERNAL] Re: Getting rid of Master/Slave nomenclature in
>>> Solr
> 
> Looking at the code I see 692 occurrences of the word "slave".
> Mostly variable names and ref guide docs.
> 
> The word "slave" is present in the responses as well. Any change in the
 request param/response payload is backward incompatible.
> 
> I have no objection to changing the names in ref guide and other
 internal variables. Going ahead with backward incompatible changes is
 painful. If somebody has the appetite to take it 

Re: Does 8.5.2 depend on 8.2.0

2020-06-18 Thread Phill Campbell
compile(group: 'org.springframework.boot',name: 
'spring-boot-starter-web',version: '2.2.6.RELEASE')
compile(group: 'javax.inject', name: 'javax.inject', version:'1')
compile(group: 'javax.ws.rs', name: 'javax.ws.rs-api', version: '2.1.1')
compile group: 'ch.qos.logback', name: 'logback-core', version: '1.2.3'
compile group: 'com.newrelic.agent.java', name: 'newrelic-api', version: 
'5.11.0'
compile group: 'org.springdoc', name: 'springdoc-openapi-ui', version: '1.3.1'
compile group: 'org.springdoc', name: 'springdoc-openapi-webmvc-core', version: 
'1.3.1'


I am starting to suspect the Spring framework.



> On Jun 18, 2020, at 12:09 PM, Chris Hostetter  
> wrote:
> 
> 
> : Subject: Does 8.5.2 depend on 8.2.0
> 
> No.  The code certainly doesn't, but I suppose it's possible some metadata 
> somewhere in some pom file may be broken? 
> 
> 
> : My build.gradle has this:
> : compile(group: 'org.apache.solr', name: 'solr-solrj', version:'8.5.2')
> : No where is there a reference to 8.2.0
> 
> it sounds like you are using transitive dependencies (otherwise it 
> wouldn't make sense for you to wonder if 8.5.2 depends on 8.2.0) ... is it 
> possible some *other* library you are depending on is depending on 8.2.0 
> directly? what does your dependency tree look like?
> 
> https://docs.gradle.org/current/userguide/viewing_debugging_dependencies.html
> 
> 
> -Hoss
> http://www.lucidworks.com/



Does 8.5.2 depend on 8.2.0

2020-06-18 Thread Phill Campbell
I use Gradle to build my project. I noticed that the solr-solrj jar the build is 
using is 8.2.0. I don’t include that version anywhere.
I cleared my Gradle cache, rebuilt, and I see this:

$find . -name "solr*"
./modules-2/metadata-2.71/descriptors/org.apache.solr/solr-parent
./modules-2/metadata-2.71/descriptors/org.apache.solr/solr-solrj
./modules-2/files-2.1/org.apache.solr/solr-parent
./modules-2/files-2.1/org.apache.solr/solr-parent/8.2.0/347744884a53d0be51aa03dcff13407dc1bbabc3/solr-parent-8.2.0.pom
./modules-2/files-2.1/org.apache.solr/solr-parent/8.5.2/28b292f323144e3c059af8f9dbb5527ef2fb858f/solr-parent-8.5.2.pom
./modules-2/files-2.1/org.apache.solr/solr-solrj
./modules-2/files-2.1/org.apache.solr/solr-solrj/8.2.0/dfe052f181213504c6546f25c0185078886148e8/solr-solrj-8.2.0.pom
./modules-2/files-2.1/org.apache.solr/solr-solrj/8.2.0/5c466f157adf03428765c6e15a3d85a08f540a05/solr-solrj-8.2.0.jar
./modules-2/files-2.1/org.apache.solr/solr-solrj/8.2.0/c247802c64bcb485b637673182061c181393d54e/solr-solrj-8.2.0-sources.jar
./modules-2/files-2.1/org.apache.solr/solr-solrj/8.5.2/35ab7fc3b6bd51440c62edd9ba1a780eec005759/solr-solrj-8.5.2-sources.jar
./modules-2/files-2.1/org.apache.solr/solr-solrj/8.5.2/2cf6f0fccbecf820c62e4e1474073f4216c70356/solr-solrj-8.5.2.jar
./modules-2/files-2.1/org.apache.solr/solr-solrj/8.5.2/43a8ff25b47ca7a230185000a46bf0897aad2e93/solr-solrj-8.5.2.pom


My build.gradle has this:
compile(group: 'org.apache.solr', name: 'solr-solrj', version:'8.5.2')
Nowhere is there a reference to 8.2.0.

Any ideas on how this could happen?

I am using the latest IntelliJ version and Gradle 5.3.




Re: Periodically 100% cpu and high load/IO

2020-06-07 Thread Phill Campbell
Can you switch to 8.5.2 and see if it still happens?
In my testing of 8.5.1 I had one of my machines get really hot and bring the 
entire system to a crawl.
What seemed to cause my issue was memory usage. I could give the JVM running 
Solr less heap and the problem wouldn’t manifest.
I haven’t seen it with 8.5.2. Just a thought.

> On Jun 3, 2020, at 8:27 AM, Marvin Bredal Lillehaug 
>  wrote:
> 
> Yes, there are light/moderate indexing most of the time.
> The setup has NRT replicas. And the shards are around 45GB each.
> Index merging has been the hypothesis for some time, but we haven't dared
> to activate info stream logging.
> 
> On Wed, Jun 3, 2020 at 2:34 PM Erick Erickson 
> wrote:
> 
>> One possibility is merging index segments. When this happens, are you
>> actively indexing? And are these NRT replicas or TLOG/PULL? If the latter,
>> are your TLOG leaders on the affected machines?
>> 
>> Best,
>> Erick
>> 
>>> On Jun 3, 2020, at 3:57 AM, Marvin Bredal Lillehaug <
>> marvin.lilleh...@gmail.com> wrote:
>>> 
>>> Hi,
>>> We have a cluster with five Solr(8.5.1, Java 11) nodes, and sometimes one
>>> or two nodes has Solr running with 100% cpu on all cores, «load» over
>> 400,
>>> and high IO. It usually lasts five to ten minutes, and the node is hardly
>>> responding.
>>> Does anyone have any experience with this type of behaviour? Is there any
>>> logging other than infostream that could give any information?
>>> 
>>> We managed to trigger a thread dump,
>>> 
 java.base@11.0.6
 
>> /java.nio.channels.spi.AbstractInterruptibleChannel.close(AbstractInterruptibleChannel.java:112)
 org.apache.lucene.util.IOUtils.fsync(IOUtils.java:483)
 org.apache.lucene.store.FSDirectory.fsync(FSDirectory.java:331)
 org.apache.lucene.store.FSDirectory.sync(FSDirectory.java:286)
 
 
>> org.apache.lucene.store.NRTCachingDirectory.sync(NRTCachingDirectory.java:158)
 
 
>> org.apache.lucene.store.LockValidatingDirectoryWrapper.sync(LockValidatingDirectoryWrapper.java:68)
 org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4805)
 
 
>> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3277)
 
>> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3445)
 org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3410)
 
 
>> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:678)
 
 
>> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:636)
 
 
>> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:337)
 org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:318)
>>> 
>>> 
>>> But not sure if this is from the incident or just right after. It seems
>>> strange that a fsync should behave like this.
>>> 
>>> Swappiness is set to default for RHEL 7 (Ops have resisted turning it
>> off)
>>> 
>>> --
>>> Kind regards,
>>> Marvin B. Lillehaug
>> 
>> 
> 
> -- 
> med vennlig hilsen,
> Marvin B. Lillehaug
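A side note on the thread dump mentioned in the quoted report: when a node only misbehaves for a few minutes it is easy to miss the window, so a small watchdog that captures a dump while the load is high can help. A generic Java sketch, not Solr-specific, that shells out to the JDK's jstack tool; the Solr PID is passed in, and the load threshold and poll interval are arbitrary placeholders:

import java.io.File;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class LoadWatchdog {
    public static void main(String[] args) throws Exception {
        String solrPid = args[0];   // PID of the Solr process to watch
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        int cores = Runtime.getRuntime().availableProcessors();

        while (true) {
            // Crude "node is hot" heuristic: load average well above the core count.
            if (os.getSystemLoadAverage() > cores * 2) {
                File out = new File("threaddump-" + System.currentTimeMillis() + ".txt");
                new ProcessBuilder("jstack", solrPid)   // JDK tool; run as the Solr user
                        .redirectErrorStream(true)
                        .redirectOutput(out)
                        .start()
                        .waitFor();
            }
            Thread.sleep(10_000L);
        }
    }
}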



Re: Need help on handling large size of index.

2020-05-22 Thread Phill Campbell
Maybe your problems are in AWS land.


> On May 22, 2020, at 3:45 AM, Modassar Ather  wrote:
> 
> Thanks Erick and Phill.
> 
> We index data weekly once and that is why we do the optimisation and it has
> helped in faster query result. I will experiment with a fewer segments with
> the current hardware.
> The thing I am not clear about is: although there is no constant high usage
> of extra IOPs other than a couple of spikes during optimisation, why is there
> so much difference in optimisation time with extra IOPs vs no
> extra IOPs?
> The optimisation on a different datacenter machine of the same
> configuration with SSD used to take 4-5 hours. That time is
> comparable to the r5a.16xlarge with extra 3 IOPs.
> 
> Best,
> Modassar
> 
> On Fri, May 22, 2020 at 12:56 AM Phill Campbell
>  wrote:
> 
>> The optimal size for a shard of the index is by definition what works best
>> on the hardware with the JVM heap that is in use.
>> More shards mean smaller sizes of the index for the shard as you already
>> know.
>> 
>> I spent months changing the sharding, the JVM heap, the GC values before
>> taking the system live.
>> RAM is important, and I run with enough to allow Solr to load the entire
>> index into RAM. From my understanding Solr uses the system to memory map
>> the index files. I might be wrong.
>> I experimented with less RAM and SSD drives and found that was another way
>> to get the performance I needed. Since RAM is cheaper, I choose that
>> approach.
>> 
>> Again we never optimize. When we have to recover we rebuild the index by
>> spinning up new machines and use a massive EMR (Map reduce job) to force
>> the data into the system. Takes about 3 hours. Solr can ingest data at an
>> amazing rate. Then we do a blue/green switch over.
>> 
>> Query time, from my experience with my environment, is improved with more
>> sharding and additional hardware. Not just more sharding on the same
>> hardware.
>> 
>> My fields are not stored either, except ID. There are some fields that are
>> indexed and have DocValues and those are used for sorting and facets. My
>> queries can have any number of wildcards as well, but my field’s data
>> lengths are maybe a maximum of 100 characters so proximity searching is not
>> too bad. I tokenize and index everything. I do not expand terms at query
>> time to get broader results, I index the alternatives and let the indexer
>> do what it does best.
>> 
>> If you are running in SolrCloud mode and you are using the embedded
>> zookeeper I would change that. Solr and ZK are very chatty with each other,
>> run ZK on machines in proximity to Solr.
>> 
>> Regards
>> 
>>> On May 21, 2020, at 2:46 AM, Modassar Ather 
>> wrote:
>>> 
>>> Thanks Phill for your response.
>>> 
>>> Optimal Index size: Depends on what you are optimizing for. Query Speed?
>>> Hardware utilization?
>>> We are optimising it for query speed. What I understand even if we set
>> the
>>> merge policy to any number the amount of hard disk will still be required
>>> for the bigger segment merges. Please correct me if I am wrong.
>>> 
>>> Optimizing the index is something I never do. We live with about 28%
>>> deletes. You should check your configuration for your merge policy.
>>> There is a delete of about 10-20% in our updates. We have no merge policy
>>> set in configuration as we do a full optimisation after the indexing.
>>> 
>>> Increased sharding has helped reduce query response time, but surely
>> there
>>> is a point where the collation of results starts to be the bottleneck.
>>> The query response time is my concern. I understand the aggregation of
>>> results may increase the search response time.
>>> 
>>> *What does your schema look like? I index around 120 fields per
>> document.*
>>> The schema has a combination of text and string fields. None of the field
>>> except Id field is stored. We also have around 120 fields. A few of them
>>> have docValues enabled.
>>> 
>>> *What do your queries look like? Mine are so varied that caching never
>>> helps, the same query rarely comes through.*
>>> Our search queries are combination of proximity, nested proximity and
>>> wildcards most of the time. The query can be very complex with 100s of
>>> wildcard and proximity terms in it. Different grouping option are also
>>> enabled on search result. And the search queries vary a lot.
>>&

Re: Need help on handling large size of index.

2020-05-21 Thread Phill Campbell
The optimal size for a shard of the index is by definition what works best on 
the hardware with the JVM heap that is in use.
More shards mean smaller sizes of the index for the shard as you already know. 

I spent months changing the sharding, the JVM heap, the GC values before taking 
the system live.
RAM is important, and I run with enough to allow Solr to load the entire index 
into RAM. From my understanding Solr uses the system to memory map the index 
files. I might be wrong.
I experimented with less RAM and SSD drives and found that was another way to 
get the performance I needed. Since RAM is cheaper, I choose that approach.

Again we never optimize. When we have to recover we rebuild the index by 
spinning up new machines and use a massive EMR (Map reduce job) to force the 
data into the system. Takes about 3 hours. Solr can ingest data at an amazing 
rate. Then we do a blue/green switch over.

Query time, from my experience with my environment, is improved with more 
sharding and additional hardware. Not just more sharding on the same hardware.

My fields are not stored either, except ID. There are some fields that are 
indexed and have DocValues and those are used for sorting and facets. My 
queries can have any number of wildcards as well, but my field’s data lengths 
are maybe a maximum of 100 characters so proximity searching is not too bad. I 
tokenize and index everything. I do not expand terms at query time to get 
broader results, I index the alternatives and let the indexer do what it does 
best.

If you are running in SolrCloud mode and you are using the embedded zookeeper I 
would change that. Solr and ZK are very chatty with each other, run ZK on 
machines in proximity to Solr.

Regards

> On May 21, 2020, at 2:46 AM, Modassar Ather  wrote:
> 
> Thanks Phill for your response.
> 
> Optimal Index size: Depends on what you are optimizing for. Query Speed?
> Hardware utilization?
> We are optimising it for query speed. What I understand even if we set the
> merge policy to any number the amount of hard disk will still be required
> for the bigger segment merges. Please correct me if I am wrong.
> 
> Optimizing the index is something I never do. We live with about 28%
> deletes. You should check your configuration for your merge policy.
> There is a delete of about 10-20% in our updates. We have no merge policy
> set in configuration as we do a full optimisation after the indexing.
> 
> Increased sharding has helped reduce query response time, but surely there
> is a point where the collation of results starts to be the bottleneck.
> The query response time is my concern. I understand the aggregation of
> results may increase the search response time.
> 
> *What does your schema look like? I index around 120 fields per document.*
> The schema has a combination of text and string fields. None of the field
> except Id field is stored. We also have around 120 fields. A few of them
> have docValues enabled.
> 
> *What do your queries look like? Mine are so varied that caching never
> helps, the same query rarely comes through.*
> Our search queries are combination of proximity, nested proximity and
> wildcards most of the time. The query can be very complex with 100s of
> wildcard and proximity terms in it. Different grouping option are also
> enabled on search result. And the search queries vary a lot.
> 
> Oh, another thing, are you concerned about  availability? Do you have a
> replication factor > 1? Do you run those replicas in a different region for
> safety?
> How many zookeepers are you running and where are they?
> As of now we do not have any replication factor. We are not using zookeeper
> ensemble but would like to move to it sooner.
> 
> Best,
> Modassar
> 
> On Thu, May 21, 2020 at 9:19 AM Shawn Heisey  wrote:
> 
>> On 5/20/2020 11:43 AM, Modassar Ather wrote:
>>> Can you please help me with following few questions?
>>> 
>>>- What is the ideal index size per shard?
>> 
>> We have no way of knowing that.  A size that works well for one index
>> use case may not work well for another, even if the index size in both
>> cases is identical.  Determining the ideal shard size requires
>> experimentation.
>> 
>> 
>> https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>> 
>>>- The optimisation takes lot of time and IOPs to complete. Will
>>>increasing the number of shards help in reducing the optimisation
>> time and
>>>IOPs?
>> 
>> No, changing the number of shards will not help with the time required
>> to optimize, and might make it slower.  Increasing the speed of the
>> disks won't help either.  Optimizing involves a lot more than just
>> copying data -- it will never use all the available disk bandwidth of
>> modern disks.  SolrCloud does optimizes of the shard replicas making up
>> a full collection sequentially, not simultaneously.
>> 
>>>- We are planning to reduce each shard index size to 30GB and the
>> 
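Related to the optimize discussion in this thread: rather than a full optimize down to a single segment, the index can be merged down to a small number of segments, which avoids creating one huge segment and bounds the size of any single merge. A minimal SolrJ sketch; the ZooKeeper address, collection name, and target segment count are placeholders:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.UpdateResponse;

import java.util.Collections;
import java.util.Optional;

public class MergeDownExample {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            // Merge down to at most 10 segments instead of forcing a single segment.
            UpdateResponse rsp = client.optimize("myCollection",
                    true,   // waitFlush
                    true,   // waitSearcher
                    10);    // maxSegments
            System.out.println("optimize status: " + rsp.getStatus());
        }
    }
}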

Re: Unbalanced shard requests

2020-05-21 Thread Phill Campbell
Yes, JVM heap settings.

> On May 19, 2020, at 10:59 AM, Wei  wrote:
> 
> Hi Phill,
> 
> What is the RAM config you are referring to, JVM size? How is that related
> to the load balancing, if each node has the same configuration?
> 
> Thanks,
> Wei
> 
> On Mon, May 18, 2020 at 3:07 PM Phill Campbell
>  wrote:
> 
>> In my previous report I was configured to use as much RAM as possible.
>> With that configuration it seemed it was not load balancing.
>> So, I reconfigured and redeployed to use 1/4 the RAM. What a difference
>> for the better!
>> 
>> 10.156.112.50   load average: 13.52, 10.56, 6.46
>> 10.156.116.34   load average: 11.23, 12.35, 9.63
>> 10.156.122.13   load average: 10.29, 12.40, 9.69
>> 
>> Very nice.
>> My tool that tests records RPS. In the “bad” configuration it was less
>> than 1 RPS.
>> NOW it is showing 21 RPS.
>> 
>> 
>> http://10.156.112.50:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes
>> <
>> http://10.156.112.50:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes
>>> 
>> {
>>  "responseHeader":{
>>"status":0,
>>"QTime":161},
>>  "metrics":{
>>"solr.core.BTS.shard1.replica_n2":{
>>  "QUERY./select.requestTimes":{
>>"count":5723,
>>"meanRate":6.8163888639859085,
>>"1minRate":11.557013215119536,
>>"5minRate":8.760356217628159,
>>"15minRate":4.707624230995833,
>>"min_ms":0.131545,
>>"max_ms":388.710848,
>>"mean_ms":30.300492048215947,
>>"median_ms":6.336654,
>>"stddev_ms":51.527164088667035,
>>"p75_ms":35.427943,
>>"p95_ms":140.025957,
>>"p99_ms":230.533099,
>>"p999_ms":388.710848
>> 
>> 
>> 
>> http://10.156.122.13:10004/solr/admin/metrics?group=core=QUERY./select.requestTimes
>> <
>> http://10.156.122.13:10004/solr/admin/metrics?group=core=QUERY./select.requestTimes
>>> 
>> {
>>  "responseHeader":{
>>"status":0,
>>"QTime":11},
>>  "metrics":{
>>"solr.core.BTS.shard2.replica_n8":{
>>  "QUERY./select.requestTimes":{
>>"count":6469,
>>"meanRate":7.502581801189549,
>>"1minRate":12.211423085368564,
>>"5minRate":9.445681397767322,
>>"15minRate":5.216209798637846,
>>"min_ms":0.154691,
>>"max_ms":701.657394,
>>"mean_ms":34.2734699171445,
>>"median_ms":5.640378,
>>"stddev_ms":62.27649205954566,
>>"p75_ms":39.016371,
>>"p95_ms":156.997982,
>>"p99_ms":288.883028,
>>"p999_ms":538.368031
>> 
>> 
>> http://10.156.116.34:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes
>> <
>> http://10.156.116.34:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes
>>> 
>> {
>>  "responseHeader":{
>>"status":0,
>>"QTime":67},
>>  "metrics":{
>>"solr.core.BTS.shard3.replica_n16":{
>>  "QUERY./select.requestTimes":{
>>"count":7109,
>>"meanRate":7.787524673806184,
>>"1minRate":11.88519763582083,
>>"5minRate":9.893315557386755,
>>"15minRate":5.620178363676527,
>>"min_ms":0.150887,
>>"max_ms":472.826462,
>>"mean_ms":32.184282366621204,
>>"median_ms":6.977733,
>>"stddev_ms":55.729908615189196,
>>"p75_ms":36.655011,
>>"p95_ms":151.12627,
>>"p99_ms":251.440162,
>>"p999_ms":472.826462
>> 
>> 
>> Compare that to the previous report and you can see the improvement.
>> So, note to myself. Figure out the sweet spot for RAM usage. Use too much
>> and strange behavior is noticed. While using too much all the load focused
>> on one box and query times slowed.
>> I did not see any OOM errors during any of this.
>> 
>> Regards
>>
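To track the per-node request distribution over time instead of sampling the metrics endpoint by hand, a rough Java sketch that polls each node's Metrics API; the node addresses below are placeholders, and the prefix parameter is the Metrics API's documented way to narrow the output to the /select request timer:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class MetricsPoller {
    // Placeholder node addresses; one entry per Solr instance to compare.
    private static final String[] NODES = {
            "http://host1:10002/solr",
            "http://host2:10004/solr",
            "http://host3:10002/solr"
    };

    public static void main(String[] args) throws Exception {
        for (String node : NODES) {
            URL url = new URL(node + "/admin/metrics?wt=json&group=core"
                    + "&prefix=QUERY./select.requestTimes");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    body.append(line).append('\n');
                }
            } finally {
                conn.disconnect();
            }
            System.out.println(node + " ->\n" + body);
        }
    }
}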

Re: Need help on handling large size of index.

2020-05-20 Thread Phill Campbell
In my world your index size is common.

Optimal Index size: Depends on what you are optimizing for. Query Speed? 
Hardware utilization? 
Optimizing the index is something I never do. We live with about 28% deletes. 
You should check your configuration for your merge policy.
I run 120 shards, and I am currently redesigning for 256 shards.
Increased sharding has helped reduce query response time, but surely there is a 
point where the collation of results starts to be the bottleneck.
I run the 120 shards on 90 r4.4xlarge instances with a replication factor of 3.

The things missing are:
What does your schema look like? I index around 120 fields per document.
What do your queries look like? Mine are so varied that caching never helps; 
the same query rarely comes through.
My system takes continuous updates, yours does not.

It is really up to you to experiment.

If you follow the development pattern of Design By Use (DBU) the first thing 
you do for solr and even for SQL is to come up with your queries first. Then 
design the schema. Then figure out how to distribute it for performance.

Oh, another thing, are you concerned about  availability? Do you have a 
replication factor > 1? Do you run those replicas in a different region for 
safety?
How many zookeepers are you running and where are they?

Lots of questions.

Regards

> On May 20, 2020, at 11:43 AM, Modassar Ather  wrote:
> 
> Hi,
> 
> Currently we have index of size 3.5 TB. These index are distributed across
> 12 shards under two cores. The size of index on each shards are almost
> equal.
> We do a delta indexing every week and optimise the index.
> 
> The server configuration is as follows.
> 
>   - Solr Version  : 6.5.1
>   - AWS instance type : r5a.16xlarge
>   - CPU(s)  : 64
>   - RAM  : 512GB
>   - EBS size  : 7 TB (For indexing as well as index optimisation.)
>   - IOPs  : 3 (For faster index optimisation)
> 
> 
> Can you please help me with following few questions?
> 
>   - What is the ideal index size per shard?
>   - The optimisation takes lot of time and IOPs to complete. Will
>   increasing the number of shards help in reducing the optimisation time and
>   IOPs?
>   - We are planning to reduce each shard index size to 30GB and the entire
>   3.5 TB index will be distributed across more shards. In this case to almost
>   70+ shards. Will this help?
>   - Will adding so many new shards increase the search response time and
>   possibly how much?
>   - If we have to increase the shards should we do it on a single larger
>   server or should do it on multiple small servers?
> 
> 
> Kindly share your thoughts on how best we can use Solr with such a large
> index size.
> 
> Best,
> Modassar



Re: Unbalanced shard requests

2020-05-18 Thread Phill Campbell
In my previous report I was configured to use as much RAM as possible. With 
that configuration it seemed it was not load balancing.
So, I reconfigured and redeployed to use 1/4 the RAM. What a difference for the 
better!

10.156.112.50   load average: 13.52, 10.56, 6.46
10.156.116.34   load average: 11.23, 12.35, 9.63
10.156.122.13   load average: 10.29, 12.40, 9.69

Very nice.
My testing tool records RPS. In the “bad” configuration it was less than 1 
RPS.
NOW it is showing 21 RPS.

http://10.156.112.50:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes
 
<http://10.156.112.50:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes>
{
  "responseHeader":{
"status":0,
"QTime":161},
  "metrics":{
"solr.core.BTS.shard1.replica_n2":{
  "QUERY./select.requestTimes":{
"count":5723,
"meanRate":6.8163888639859085,
"1minRate":11.557013215119536,
"5minRate":8.760356217628159,
"15minRate":4.707624230995833,
"min_ms":0.131545,
"max_ms":388.710848,
"mean_ms":30.300492048215947,
"median_ms":6.336654,
"stddev_ms":51.527164088667035,
"p75_ms":35.427943,
"p95_ms":140.025957,
"p99_ms":230.533099,
"p999_ms":388.710848


http://10.156.122.13:10004/solr/admin/metrics?group=core=QUERY./select.requestTimes
 
<http://10.156.122.13:10004/solr/admin/metrics?group=core=QUERY./select.requestTimes>
{
  "responseHeader":{
"status":0,
"QTime":11},
  "metrics":{
"solr.core.BTS.shard2.replica_n8":{
  "QUERY./select.requestTimes":{
"count":6469,
"meanRate":7.502581801189549,
"1minRate":12.211423085368564,
"5minRate":9.445681397767322,
"15minRate":5.216209798637846,
"min_ms":0.154691,
"max_ms":701.657394,
"mean_ms":34.2734699171445,
"median_ms":5.640378,
"stddev_ms":62.27649205954566,
"p75_ms":39.016371,
"p95_ms":156.997982,
"p99_ms":288.883028,
"p999_ms":538.368031

http://10.156.116.34:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes
 
<http://10.156.116.34:10002/solr/admin/metrics?group=core=QUERY./select.requestTimes>
{
  "responseHeader":{
"status":0,
"QTime":67},
  "metrics":{
"solr.core.BTS.shard3.replica_n16":{
  "QUERY./select.requestTimes":{
"count":7109,
"meanRate":7.787524673806184,
"1minRate":11.88519763582083,
"5minRate":9.893315557386755,
"15minRate":5.620178363676527,
"min_ms":0.150887,
"max_ms":472.826462,
"mean_ms":32.184282366621204,
"median_ms":6.977733,
"stddev_ms":55.729908615189196,
"p75_ms":36.655011,
"p95_ms":151.12627,
"p99_ms":251.440162,
"p999_ms":472.826462


Compare that to the previous report and you can see the improvement.
So, note to myself. Figure out the sweet spot for RAM usage. Use too much and 
strange behavior is noticed. While using too much all the load focused on one 
box and query times slowed.
I did not see any OOM errors during any of this.

Regards



> On May 18, 2020, at 3:23 PM, Phill Campbell  
> wrote:
> 
> I have been testing 8.5.2 and it looks like the load has moved but is still 
> on one machine.
> 
> Setup:
> 3 physical machines.
> Each machine hosts 8 instances of Solr.
> Each instance of Solr hosts one replica.
> 
> Another way to say it:
> Number of shards = 8. Replication factor = 3.
> 
> Here is the cluster state. You can see that the leaders are well distributed. 
> 
> {"TEST_COLLECTION":{
>"pullReplicas":"0",
>"replicationFactor":"3",
>"shards":{
>  "shard1":{
>"range":"8000-9fff",
>"state":"active",
>"replicas":{
>  "core_node3":{
>"core":"TEST_COLLECTION_shard1_replica_n1",
>"base_url":"http://10.156.122.13:10007/solr;,
>"node_name":"10.156.122.13:10007_solr",
>"state":"active",
>"type":"NRT",
>"for

Re: Unbalanced shard requests

2020-05-18 Thread Phill Campbell
I have been testing 8.5.2 and it looks like the load has moved but is still on 
one machine.

Setup:
3 physical machines.
Each machine hosts 8 instances of Solr.
Each instance of Solr hosts one replica.

Another way to say it:
Number of shards = 8. Replication factor = 3.

Here is the cluster state. You can see that the leaders are well distributed. 

{"TEST_COLLECTION":{
"pullReplicas":"0",
"replicationFactor":"3",
"shards":{
  "shard1":{
"range":"8000-9fff",
"state":"active",
"replicas":{
  "core_node3":{
"core":"TEST_COLLECTION_shard1_replica_n1",
"base_url":"http://10.156.122.13:10007/solr;,
"node_name":"10.156.122.13:10007_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"},
  "core_node5":{
"core":"TEST_COLLECTION_shard1_replica_n2",
"base_url":"http://10.156.112.50:10002/solr;,
"node_name":"10.156.112.50:10002_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false",
"leader":"true"},
  "core_node7":{
"core":"TEST_COLLECTION_shard1_replica_n4",
"base_url":"http://10.156.112.50:10006/solr;,
"node_name":"10.156.112.50:10006_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"}}},
  "shard2":{
"range":"a000-bfff",
"state":"active",
"replicas":{
  "core_node9":{
"core":"TEST_COLLECTION_shard2_replica_n6",
"base_url":"http://10.156.112.50:10003/solr;,
"node_name":"10.156.112.50:10003_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"},
  "core_node11":{
"core":"TEST_COLLECTION_shard2_replica_n8",
"base_url":"http://10.156.122.13:10004/solr;,
"node_name":"10.156.122.13:10004_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false",
"leader":"true"},
  "core_node12":{
"core":"TEST_COLLECTION_shard2_replica_n10",
"base_url":"http://10.156.116.34:10008/solr;,
"node_name":"10.156.116.34:10008_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"}}},
  "shard3":{
"range":"c000-dfff",
"state":"active",
"replicas":{
  "core_node15":{
"core":"TEST_COLLECTION_shard3_replica_n13",
"base_url":"http://10.156.122.13:10008/solr;,
"node_name":"10.156.122.13:10008_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"},
  "core_node17":{
"core":"TEST_COLLECTION_shard3_replica_n14",
"base_url":"http://10.156.116.34:10005/solr;,
"node_name":"10.156.116.34:10005_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"},
  "core_node19":{
"core":"TEST_COLLECTION_shard3_replica_n16",
"base_url":"http://10.156.116.34:10002/solr;,
"node_name":"10.156.116.34:10002_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false",
"leader":"true"}}},
  "shard4":{
"range":"e000-",
"state":"active",
"replicas":{
  "core_node20":{
"core":"TEST_COLLECTION_shard4_replica_n18",
"base_url":"http://10.156.122.13:10001/solr;,
"node_name":"10.156.122.13:10001_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"},
  "core_node23":{
"core":"TEST_COLLECTION_shard4_replica_n21",
"base_url":"http://10.156.116.34:10004/solr;,
"node_name":"10.156.116.34:10004_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"},
  "core_node25":{
"core":"TEST_COLLECTION_shard4_replica_n22",
"base_url":"http://10.156.112.50:10001/solr;,
"node_name":"10.156.112.50:10001_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false",
"leader":"true"}}},
  "shard5":{
"range":"0-1fff",
"state":"active",
"replicas":{
  "core_node27":{
"core":"TEST_COLLECTION_shard5_replica_n24",
"base_url":"http://10.156.116.34:10007/solr;,
"node_name":"10.156.116.34:10007_solr",
"state":"active",
"type":"NRT",
"force_set_state":"false"},
  "core_node29":{
"core":"TEST_COLLECTION_shard5_replica_n26",
"base_url":"http://10.156.122.13:10006/solr;,

Re: Download a pre-release version? 8.6

2020-05-15 Thread Phill Campbell
Seems like it would be a good idea to put this in 8.5.2. I was running 3 
machines, and the first machine would be running so hot that the response time 
went from 350ms to 12,000ms. I would kill that machine and the times would come 
back to normal. I would start it back up and it wasn’t long before it was 
running hot again while the other two machines were running low CPU.

It could really cause someone some grief if they didn’t find this out before 
going “live”.


> On May 15, 2020, at 3:07 PM, Mike Drob  wrote:
> 
> We could theoretically include this in a 8.5.2 version which should be
> released soon. The change looks minimally risky to backport?
> 
> On Fri, May 15, 2020 at 3:43 PM Jan Høydahl  wrote:
> 
>> Check Jenkins:
>> https://builds.apache.org/view/L/view/Lucene/job/Solr-Artifacts-8.x/lastSuccessfulBuild/artifact/solr/package/
>> 
>> Jan Høydahl
>> 
>>> 15. mai 2020 kl. 22:27 skrev Phill Campbell
>> :
>>> 
>>> Is there a way to download a tgz of the binary of a nightly build or
>> similar?
>>> 
>>> I have been testing 8.5.1 and ran into the bug with load balancing.
>>> https://issues.apache.org/jira/browse/SOLR-14471 <
>> https://issues.apache.org/jira/browse/SOLR-14471>
>>> 
>>> It is a deal breaker for me to move forward with an upgrade of the
>> system.
>>> 
>>> I would like to start evaluating a version that has the fix.
>>> 
>>> Is there a place to get a build?
>>> 
>>> Thank you.
>> 



Download a pre-release version? 8.6

2020-05-15 Thread Phill Campbell
Is there a way to download a tgz of the binary of a nightly build or similar?

I have been testing 8.5.1 and ran into the bug with load balancing. 
https://issues.apache.org/jira/browse/SOLR-14471 


It is a deal breaker for me to move forward with an upgrade of the system.

I would like to start evaluating a version that has the fix. 

Is there a place to get a build?

Thank you. 

Re: Solr 8.5.1 query timeAllowed exceeded throws exception

2020-05-12 Thread Phill Campbell
Upon examining the Solr source code it appears that it was unable to even make 
a connection in the time allowed.
While the error message was a bit confusing, I do understand what it means.


> On May 12, 2020, at 2:08 PM, Phill Campbell  
> wrote:
> 
> 
> 
> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle this 
> request exceeded:…
>   at 
> org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:345)
>   at 
> org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1143)
>   at 
> org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
>   at 
> org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
>   at 
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
>   at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1035)
> ...
>   at javax.swing.SwingWorker$1.call(SwingWorker.java:295)
>   at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
>   at java.util.concurrent.FutureTask.run(FutureTask.java)
>   at javax.swing.SwingWorker.run(SwingWorker.java:334)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://10.156.112.50:10001/solr/BTS: 
> java.lang.NullPointerException
> 
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:665)
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
>   at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
>   at 
> org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:368)
>   at 
> org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:296)
> 
> 
> The timeAllowed is set to 8 seconds. I am using a StopWatch to verify that 
> the round trip was greater than 8 seconds.
> 
> Documentation states:
> 
> timeAllowed Parameter
> This parameter specifies the amount of time, in milliseconds, allowed for a 
> search to complete. If this time expires before the search is complete, any 
> partial results will be returned, but values such as numFound, facet counts, 
> and result stats may not be accurate for the entire result set. In case of 
> expiration, if omitHeader isn’t set to true the response header contains a 
> special flag called partialResults.
> 
> I do not believe I should be getting an exception.
> 
> I am load testing so I am intentionally putting pressure on the system.
> 
> Is this the correct behavior to throw an exception?
> 
> Regards.



Solr 8.5.1 query timeAllowed exceeded throws exception

2020-05-12 Thread Phill Campbell



org.apache.solr.client.solrj.SolrServerException: Time allowed to handle this 
request exceeded:…
at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:345)
at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1143)
at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1035)
...
at javax.swing.SwingWorker$1.call(SwingWorker.java:295)
at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
at java.util.concurrent.FutureTask.run(FutureTask.java)
at javax.swing.SwingWorker.run(SwingWorker.java:334)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://10.156.112.50:10001/solr/BTS: 
java.lang.NullPointerException

at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:665)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at 
org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:368)
at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:296)


The timeAllowed is set to 8 seconds. I am using a StopWatch to verify that the 
round trip was greater than 8 seconds.

Documentation states:

timeAllowed Parameter
This parameter specifies the amount of time, in milliseconds, allowed for a 
search to complete. If this time expires before the search is complete, any 
partial results will be returned, but values such as numFound, facet counts, 
and result stats may not be accurate for the entire result set. In case of 
expiration, if omitHeader isn’t set to true the response header contains a 
special flag called partialResults.

I do not believe I should be getting an exception.

I am load testing so I am intentionally putting pressure on the system.

Is this the correct behavior to throw an exception?

Regards.
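Whether Solr should return partial results here instead of throwing is what SOLR-14758 tracks; as a client-side stopgap, the wrapped remote failure can at least be told apart from other errors. A rough SolrJ sketch of that handling, assuming the client and the query with timeAllowed are already built:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

import java.io.IOException;

public class TimeAllowedHandling {
    static QueryResponse runQuery(CloudSolrClient client, SolrQuery query) {
        try {
            return client.query(query);
        } catch (SolrServerException e) {
            // Matches the trace above: LBSolrClient wraps the remote node's error
            // (here the NPE from SOLR-14758) in a SolrServerException.
            if (e.getCause() instanceof HttpSolrClient.RemoteSolrException) {
                System.err.println("remote node failed: " + e.getCause().getMessage());
            } else {
                System.err.println("request failed: " + e.getMessage());
            }
            return null;
        } catch (IOException e) {
            System.err.println("I/O failure: " + e.getMessage());
            return null;
        }
    }
}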

Re: Solr 8.5.1 Using Port 10001 doesn't work in Dashboard

2020-05-04 Thread Phill Campbell
I installed PostMan and verified that the response from Solr is correct.
I cleared cached images and files for Chrome and the problem is solved.

> On May 1, 2020, at 3:42 PM, Sylvain James  wrote:
> 
> Hi Phil,
> 
> I encountered something similar recently, and after switched to Firefox,
> all urls were fine.
> May be a encoding side effect.
> It seems to me that a new solr ui is in development. May be this issue will
> be fixed for the release of this ui.
> 
> Sylvain
> 
> 
> Le ven. 1 mai 2020 à 22:52, Phill Campbell  <mailto:sirgilli...@yahoo.com.invalid>>
> a écrit :
> 
>> The browser is Chrome. I forgot to state that before.
>> That got me to thinking and so I ran it from Fire Fox.
>> Everything seems to be fine there!
>> 
>> Interesting. Since this is my development environment I do not run any
>> plugins on any of my browsers.
>> 
>>> On May 1, 2020, at 2:41 PM, Phill Campbell 
>> wrote:
>>> 
>>> Today I installed Solr 8.5.1 to replace an 8.2.0 installation.
>>> It is a clean install, not a migration, there was no data that I needed
>> to keep.
>>> 
>>> I run Solr (Solr Cloud Mode) on ports starting with 10001. I have been
>> doing this since Solr 5x releases.
>>> 
>>> In my experiment I have 1 shard with replication factor of 2.
>>> 
>>> http://10.xxx.xxx.xxx:10001/solr/#/ <http://10.xxx.xxx.xxx:10001/solr/#/
>>> 
>>> 
>>> http://10.xxx.xxx.xxx:10002/solr/#/ <http://10.xxx.xxx.xxx:10002/solr/#/
>>> 
>>> 
>>> If I go to the “10001” instance the URL changes and is messed up and no
>> matter which link in the dashboard I click it shows the same information.
>>> So, Solr is running and the dashboard comes up.
>>> 
>>> The URL changes and looks like this:
>>> 
>>> http://10.xxx.xxx.xxx:10001/solr/#!/#%2F
>> <http://10.xxx.xxx.xxx:10001/solr/#!/%23%2F 
>> <http://10.xxx.xxx.xxx:10001/solr/#!/%23%2F>>
>>> 
>>> However, on port 10002 it stays like this and show the proper UI in the
>> dashboard:
>>> 
>>> http://10.xxx.xxx.xxx:10002/solr/#/ <http://10.xxx.xxx.xxx:10002/solr/#/
>>> 
>>> 
>>> To make sure something wasn’t interfering with port 10001 I re-installed
>> my previous Solr installation and it works fine.
>>> 
>>> What is this “#!” (Hash bang) stuff in the URL?
>>> How can I run on port 10001?
>>> 
>>> Probably something obvious, but I just can’t see it.
>>> 
>>> For every link from the dashboard:
>>> :10001/solr/#!/#%2F~logging
>>> :10001/solr/#!/#%2F~cloud
>>> :10001/solr/#!/#%2F~collections
>>> :10001/solr/#!/#%2F~java-properties
>>> :10001/solr/#!/#%2F~threads
>>> :10001/solr/#!/#%2F~cluster-suggestions
>>> 
>>> 
>>> 
>>> From “10002” I see everything fine.
>>> :10002/solr/#/~cloud
>>> 
>>> Shows the following:
>>> 
>>> Host
>>> 10.xxx.xxx.xxx
>>> Linux 3.10.0-1127.el7.x86_64, 2cpu
>>> Uptime: unknown
>>> Memory: 14.8Gb
>>> File descriptors: 180/100
>>> Disk: 49.1Gb used: 5%
>>> Load: 0
>>> 
>>> Node
>>> 10001_solr
>>> Uptime: 2h 10m
>>> Java 1.8.0_222
>>> Solr 8.5.1
>>> ---
>>> 10002_solr
>>> Uptime: 2h 9m
>>> Java 1.8.0_222
>>> Solr 8.5.1
>>> 
>>> 
>>> If I switch my starting port from 10001 to 10002 both instances work.
>> (10002, and 10003)
>>> If I switch my starting port from 10001 to 10101 both instances work.
>> (10101, and 10102)
>>> 
>>> Any help is appreciated.



Re: Solr 8.5.1 Using Port 10001 doesn't work in Dashboard

2020-05-01 Thread Phill Campbell
Unless someone knows something concrete, I am going to move forward and assume 
that it is Google Chrome.
Thank you Sylvain.

> On May 1, 2020, at 3:42 PM, Sylvain James <mailto:sylvain.ja...@gmail.com> wrote:
> 
> Hi Phil,
> 
> I encountered something similar recently, and after switching to Firefox
> all URLs were fine.
> Maybe an encoding side effect.
> It seems to me that a new Solr UI is in development. Maybe this issue will
> be fixed in the release of that UI.
> 
> Sylvain
> 
> 
> On Fri. May 1, 2020 at 22:52, Phill Campbell <mailto:sirgilli...@yahoo.com.invalid>
> wrote:
> 
>> The browser is Chrome. I forgot to state that before.
>> That got me thinking, so I ran it in Firefox.
>> Everything seems to be fine there!
>> 
>> Interesting. Since this is my development environment I do not run any
>> plugins on any of my browsers.
>> 
>>> On May 1, 2020, at 2:41 PM, Phill Campbell <mailto:sirgilli...@yahoo.com.INVALID> wrote:
>>> 
>>> Today I installed Solr 8.5.1 to replace an 8.2.0 installation.
>>> It is a clean install, not a migration, there was no data that I needed
>> to keep.
>>> 
>>> I run Solr (Solr Cloud Mode) on ports starting with 10001. I have been
>> doing this since Solr 5x releases.
>>> 
>>> In my experiment I have 1 shard with replication factor of 2.
>>> 
>>> http://10.xxx.xxx.xxx:10001/solr/#/
>>> 
>>> 
>>> http://10.xxx.xxx.xxx:10002/solr/#/
>>> 
>>> 
>>> If I go to the “10001” instance the URL changes and is messed up and no
>> matter which link in the dashboard I click it shows the same information.
>>> So, Solr is running, the dashboard comes up.
>>> 
>>> The URL changes and looks like this:
>>> 
>>> http://10.xxx.xxx.xxx:10001/solr/#!/#%2F
>>> 
>>> However, on port 10002 it stays like this and shows the proper UI in the
>> dashboard:
>>> 
>>> http://10.xxx.xxx.xxx:10002/solr/#/
>>> 
>>> 
>>> To make sure something wasn’t interfering with port 10001 I re-installed
>> my previous Solr installation and it works fine.
>>> 
>>> What is this “#!” (Hash bang) stuff in the URL?
>>> How can I run on port 10001?
>>> 
>>> Probably something obvious, but I just can’t see it.
>>> 
>>> For every link from the dashboard:
>>> :10001/solr/#!/#%2F~logging
>>> :10001/solr/#!/#%2F~cloud
>>> :10001/solr/#!/#%2F~collections
>>> :10001/solr/#!/#%2F~java-properties
>>> :10001/solr/#!/#%2F~threads
>>> :10001/solr/#!/#%2F~cluster-suggestions
>>> 
>>> 
>>> 
>>> From “10002” I see everything fine.
>>> :10002/solr/#/~cloud
>>> 
>>> Shows the following:
>>> 
>>> Host
>>> 10.xxx.xxx.xxx
>>> Linux 3.10.0-1127.el7.x86_64, 2cpu
>>> Uptime: unknown
>>> Memory: 14.8Gb
>>> File descriptors: 180/100
>>> Disk: 49.1Gb used: 5%
>>> Load: 0
>>> 
>>> Node
>>> 10001_solr
>>> Uptime: 2h 10m
>>> Java 1.8.0_222
>>> Solr 8.5.1
>>> ---
>>> 10002_solr
>>> Uptime: 2h 9m
>>> Java 1.8.0_222
>>> Solr 8.5.1
>>> 
>>> 
>>> If I switch my starting port from 10001 to 10002 both instances work.
>> (10002, and 10003)
>>> If I switch my starting port from 10001 to 10101 both instances work.
>> (10101, and 10102)
>>> 
>>> Any help is appreciated.



Re: Solr 8.5.1 Using Port 10001 doesn't work in Dashboard

2020-05-01 Thread Phill Campbell
The browser is Chrome. I forgot to state that before.
That got me thinking, so I ran it in Firefox.
Everything seems to be fine there! 

Interesting. Since this is my development environment I do not run any plugins 
on any of my browsers.

> On May 1, 2020, at 2:41 PM, Phill Campbell  
> wrote:
> 
> Today I installed Solr 8.5.1 to replace an 8.2.0 installation.
> It is a clean install, not a migration, there was no data that I needed to 
> keep.
> 
> I run Solr (Solr Cloud Mode) on ports starting with 10001. I have been doing 
> this since Solr 5x releases.
> 
> In my experiment I have 1 shard with replication factor of 2.
> 
> http://10.xxx.xxx.xxx:10001/solr/#/
> 
> http://10.xxx.xxx.xxx:10002/solr/#/
> 
> If I go to the “10001” instance the URL changes and is messed up and no 
> matter which link in the dashboard I click it shows the same information.
> So, Solr is running, the dashboard comes up.
> 
> The URL changes and looks like this:
> 
> http://10.xxx.xxx.xxx:10001/solr/#!/#%2F
> 
> However, on port 10002 it stays like this and shows the proper UI in the
> dashboard:
> 
> http://10.xxx.xxx.xxx:10002/solr/#/
> 
> To make sure something wasn’t interfering with port 10001 I re-installed my 
> previous Solr installation and it works fine.
> 
> What is this “#!” (Hash bang) stuff in the URL?
> How can I run on port 10001?
> 
> Probably something obvious, but I just can’t see it.
> 
> For every link from the dashboard:
> :10001/solr/#!/#%2F~logging
> :10001/solr/#!/#%2F~cloud
> :10001/solr/#!/#%2F~collections
> :10001/solr/#!/#%2F~java-properties
> :10001/solr/#!/#%2F~threads
> :10001/solr/#!/#%2F~cluster-suggestions
> 
> 
> 
> From “10002” I see everything fine.
> :10002/solr/#/~cloud
> 
> Shows the following:
> 
> Host
> 10.xxx.xxx.xxx
> Linux 3.10.0-1127.el7.x86_64, 2cpu
> Uptime: unknown
> Memory: 14.8Gb
> File descriptors: 180/100
> Disk: 49.1Gb used: 5%
> Load: 0
> 
> Node
> 10001_solr
> Uptime: 2h 10m
> Java 1.8.0_222
> Solr 8.5.1
> ---
> 10002_solr
> Uptime: 2h 9m
> Java 1.8.0_222
> Solr 8.5.1
> 
> 
> If I switch my starting port from 10001 to 10002 both instances work. (10002, 
> and 10003)
> If I switch my starting port from 10001 to 10101 both instances work. (10101, 
> and 10102)
> 
> Any help is appreciated.



Solr 8.5.1 Using Port 10001 doesn't work in Dashboard

2020-05-01 Thread Phill Campbell
Today I installed Solr 8.5.1 to replace an 8.2.0 installation.
It is a clean install, not a migration, there was no data that I needed to keep.

I run Solr (Solr Cloud Mode) on ports starting with 10001. I have been doing 
this since Solr 5x releases.

In my experiment I have 1 shard with replication factor of 2.

http://10.xxx.xxx.xxx:10001/solr/#/ 

http://10.xxx.xxx.xxx:10002/solr/#/ 

If I go to the “10001” instance the URL changes and is messed up and no matter 
which link in the dashboard I click it shows the same information.
So, Solr is running, the dashboard comes up.

The URL changes and looks like this:

http://10.xxx.xxx.xxx:10001/solr/#!/#%2F

However, on port 10002 it stays like this and shows the proper UI in the
dashboard:

http://10.xxx.xxx.xxx:10002/solr/#/ 

To make sure something wasn’t interfering with port 10001 I re-installed my 
previous Solr installation and it works fine.

What is this “#!” (Hash bang) stuff in the URL?
How can I run on port 10001?

Probably something obvious, but I just can’t see it.

For every link from the dashboard:
:10001/solr/#!/#%2F~logging
:10001/solr/#!/#%2F~cloud
:10001/solr/#!/#%2F~collections
:10001/solr/#!/#%2F~java-properties
:10001/solr/#!/#%2F~threads
:10001/solr/#!/#%2F~cluster-suggestions



From “10002” I see everything fine.
:10002/solr/#/~cloud

Shows the following:

Host
10.xxx.xxx.xxx
Linux 3.10.0-1127.el7.x86_64, 2cpu
Uptime: unknown
Memory: 14.8Gb
File descriptors: 180/100
Disk: 49.1Gb used: 5%
Load: 0

Node
10001_solr
Uptime: 2h 10m
Java 1.8.0_222
Solr 8.5.1
---
10002_solr
Uptime: 2h 9m
Java 1.8.0_222
Solr 8.5.1


If I switch my starting port from 10001 to 10002 both instances work. (10002, 
and 10003)
If I switch my starting port from 10001 to 10101 both instances work. (10101, 
and 10102)

Any help is appreciated.