Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-23 Thread Noble Paul
Do we even call it the master/slave mode? I thought we had 2 modes:

* Standalone mode
* SolrCloud mode

On Wed, Jun 24, 2020 at 3:00 AM Tomás Fernández Löbbe
 wrote:
>
> I agree in general with what Trey and Jan said and have suggested. I
> personally like to use "leader/follower". It's true that it somewhat collides
> with SolrCloud terminology, but IMO that's not a problem: now that replica
> types exist, the “role” of the replica (leader vs. non-leader/follower)
> doesn’t specify the internals of how it behaves; the replica type defines
> that. So, in a non-SolrCloud world, they would still be leaders/followers
> regardless of how they perform that role.
>
> I also agree that the name of the role is not that important; it's more that
> the "mode" of the architecture needs to be renamed. We tend to refer to
> "SolrCloud mode" and "Master/Slave mode", and the main part of all this (IMO)
> is to change that "mode" name. I kind of like Trey's suggestion of "Managed
> Clustering" vs. "Manual Clustering" mode (or "managed" vs. "manual"), but I
> still haven't made up my mind (especially since "manual" usually
> doesn't really mean "manual"; it just means "you build your own tools")…
>
> On Fri, Jun 19, 2020 at 1:38 PM Walter Underwood 
> wrote:
>
> > > On Jun 19, 2020, at 7:48 AM, Phill Campbell
> >  wrote:
> > >
> > > Delegator - Handler
> > >
> > > A common pattern we are all aware of. Pretty simple.
> >
> > The Solr master does not delegate and the slave does not handle.
> > The master is a server that handles replication requests from the
> > slave.
> >
> > Delegator/handler is a common pattern, but it is not the pattern
> > that describes traditional Solr replication.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >



-- 
-
Noble Paul


solr fq with contains not returning any results

2020-06-23 Thread yaswanth kumar
I am using Solr 8.2.

When trying fq=auto_nsallschools:*bostonschool*, the data is not
returned. But the same query in Solr 5.5 (which we still have; we are in the
process of migrating to 8.2) returns results.

If I do fq=auto_nsallschools:bostonschool
or
fq=auto_nsallschools:bostonschool*, it returns results, but when I try
a contains query as described above, or fq=auto_nsallschools:*bostonschool
(ends with), it returns no results.

The field we are using is a multi-valued copy field. Am I doing something
wrong, or does 8.2 need some adjustment in the configs?
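
A point worth checking that often explains a 5.5-to-8.2 difference: wildcard queries are matched against the indexed tokens produced by the field's analyzer, not against the raw stored value. If the fieldType behind the copy field changed between versions (say, from a single-token string field to a tokenized text field), `*bostonschool*` can stop matching. A small self-contained sketch of that token-level behavior; the tokens below are hypothetical, not taken from the poster's schema:

```python
import fnmatch

def wildcard_matches(pattern, indexed_tokens):
    """Return the indexed tokens a Lucene-style wildcard would match.
    Wildcards match against whole indexed tokens, not the raw stored value."""
    return [t for t in indexed_tokens if fnmatch.fnmatchcase(t, pattern)]

# If the analyzer keeps the whole value as one token (e.g. a string/keyword field):
keyword_tokens = ["bostonschool"]
# If the analyzer splits the value (e.g. a tokenized text field fed "boston school"):
split_tokens = ["boston", "school"]

print(wildcard_matches("*bostonschool*", keyword_tokens))  # ['bostonschool']
print(wildcard_matches("*bostonschool*", split_tokens))    # []
print(wildcard_matches("boston*", split_tokens))           # ['boston']
```

So the first step is to compare the analysis chain for auto_nsallschools in the 5.5 and 8.2 schemas (the admin UI's Analysis screen shows the emitted tokens).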

Here is the schema: (XML stripped by the mailing-list archive)

Thanks,

-- 
Thanks & Regards,
Yaswanth Kumar Konathala.
yaswanth...@gmail.com


Restored collection cluster status rendering some values as Long (as opposed to String for other collections)

2020-06-23 Thread Aliaksandr Asiptsou
Hello Solr experts,

Our team noticed the below behavior:

1. A collection is restored from a backup, and a replication factor is 
specified within the restore command:

/solr/admin/collections?action=RESTORE&name=backup_name&location=/backups/solr&collection=collection_name&collection.configName=config_name&replicationFactor=1&maxShardsPerNode=1

2. The collection restored successfully, but looking at the cluster status we see 
several values rendered as Long for this particular collection:

/solr/admin/collections?action=CLUSTERSTATUS&wt=xml

0
1
1
false
1
0
138

Whereas for all the other collections pullReplicas, replicationFactor, 
nrtReplicas and tlogReplicas are Strings.

Please advise whether this is known and expected behavior or whether it needs to be
fixed (if so, is there already a Jira ticket for this, or should we create one)?

Best regards,
Aliaksandr Asiptsou


Re: replica deleted but directory remains

2020-06-23 Thread Erick Erickson
In a word, “yes”. It looks like the information in
Zookeeper has been updated to reflect the deletion. But since the
node for some mysterious reason wasn’t available when the replica
was deleted, the data couldn’t be removed.

Best,
Erick
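
For future deletions, the Collections API DELETEREPLICA call accepts deleteDataDir and deleteInstanceDir parameters that remove the directories as part of the delete, provided the hosting node is live at the time (which is exactly what failed in the scenario above). A sketch that only builds the request URL; the host, collection, and replica names are placeholders:

```python
from urllib.parse import urlencode

def delete_replica_url(base, collection, shard, replica,
                       delete_data_dir=True, delete_instance_dir=True):
    """Build a Collections API DELETEREPLICA request that also removes
    the replica's data and instance directories (only effective while
    the hosting node is up)."""
    params = {
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,
        "deleteDataDir": str(delete_data_dir).lower(),
        "deleteInstanceDir": str(delete_instance_dir).lower(),
    }
    return f"{base}/admin/collections?{urlencode(params)}"

url = delete_replica_url("http://localhost:8983/solr",
                         "mycollection", "shard1", "core_node5")
print(url)
```

Once the replica is already gone from ZooKeeper (as here), re-issuing the call won't help, so deleting the orphaned directory by hand is the remaining option.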

> On Jun 23, 2020, at 12:58 PM, Odysci  wrote:
> 
> Hi,
> I've got a solrcloud configuration with 2 shards and 2 replicas each.
> For some unknown reason, one of the replicas was on "recovery" mode
> forever, so I decided to create another replica, which went fine.
> Then I proceeded to delete the old replica (using the Solr UI). After a
> while the interface gave me a msg about not being able to connect to the
> solr node. But once I refreshed it, the old replica was no longer showing
> in the interface, and the new replica was active.
> However, the directory on disk for the old replica is still there (and its
> size is larger than before).
> A previous time, when I did this in exactly the same way, the directory
> was removed.
> 
> My question is, can I manually delete the directory for the old replica?
> Or is there a solr command that will do this cleanly?
> Thanks
> 
> Reinaldo



Re: Retrieve disk usage & release disk space after delete

2020-06-23 Thread ChienHuaWang
Q1: I'm looking for the disk usage data shown in the Solr admin UI under
Cloud/Nodes. Is there any way to get that node table through an API call?

Q2: Thanks for the helpful information about deleting the data.
The main issue I have now is with deleting collections: even when I delete via
the admin UI, the data isn't supposed to hang around in the data dirs, is it? I'm
not seeing any specific error message so far; what could cause this?

Regards,
Chien



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-23 Thread Tomás Fernández Löbbe
I agree in general with what Trey and Jan said and have suggested. I
personally like to use "leader/follower". It's true that it somewhat collides
with SolrCloud terminology, but IMO that's not a problem: now that replica
types exist, the “role” of the replica (leader vs. non-leader/follower)
doesn’t specify the internals of how it behaves; the replica type defines
that. So, in a non-SolrCloud world, they would still be leaders/followers
regardless of how they perform that role.

I also agree that the name of the role is not that important; it's more that
the "mode" of the architecture needs to be renamed. We tend to refer to
"SolrCloud mode" and "Master/Slave mode", and the main part of all this (IMO)
is to change that "mode" name. I kind of like Trey's suggestion of "Managed
Clustering" vs. "Manual Clustering" mode (or "managed" vs. "manual"), but I
still haven't made up my mind (especially since "manual" usually
doesn't really mean "manual"; it just means "you build your own tools")…

On Fri, Jun 19, 2020 at 1:38 PM Walter Underwood 
wrote:

> > On Jun 19, 2020, at 7:48 AM, Phill Campbell
>  wrote:
> >
> > Delegator - Handler
> >
> > A common pattern we are all aware of. Pretty simple.
>
> The Solr master does not delegate and the slave does not handle.
> The master is a server that handles replication requests from the
> slave.
>
> Delegator/handler is a common pattern, but it is not the pattern
> that describes traditional Solr replication.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>


replica deleted but directory remains

2020-06-23 Thread Odysci
Hi,
I've got a solrcloud configuration with 2 shards and 2 replicas each.
For some unknown reason, one of the replicas was on "recovery" mode
forever, so I decided to create another replica, which went fine.
Then I proceeded to delete the old replica (using the Solr UI). After a
while the interface gave me a msg about not being able to connect to the
solr node. But once I refreshed it, the old replica was no longer showing
in the interface, and the new replica was active.
However, the directory on disk for the old replica is still there (and its
size is larger than before).
A previous time, when I did this in exactly the same way, the directory
was removed.

My question is, can I manually delete the directory for the old replica?
Or is there a solr command that will do this cleanly?
Thanks

Reinaldo


Re: Retrieve disk usage & release disk space after delete

2020-06-23 Thread Walter Underwood
We get disk usage on volumes using Telegraf.

I’m planning on writing something that gathers size info (docs and bytes) 
by getting core info from the CLUSTERSTATUS request then using the
CoreAdmin API to get the detailed info about cores. It doesn’t look hard,
just complicated. Fire up Python and start walking JSON data.
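
The CLUSTERSTATUS walk Walter describes can be sketched in a few lines. The payload below is an abbreviated, hypothetical sample; a real one comes from /solr/admin/collections?action=CLUSTERSTATUS&wt=json, and the per-core sizes would then come from CoreAdmin STATUS calls against each node:

```python
# Abbreviated sample of a CLUSTERSTATUS response (structure only; values are made up).
sample = {
    "cluster": {
        "collections": {
            "products": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node3": {"core": "products_shard1_replica_n1",
                                           "node_name": "host1:8983_solr"},
                            "core_node5": {"core": "products_shard1_replica_n2",
                                           "node_name": "host2:8983_solr"},
                        }
                    }
                }
            }
        }
    }
}

def cores_by_node(status):
    """Map node_name -> list of core names, ready for per-node CoreAdmin calls."""
    out = {}
    for coll in status["cluster"]["collections"].values():
        for shard in coll["shards"].values():
            for replica in shard["replicas"].values():
                out.setdefault(replica["node_name"], []).append(replica["core"])
    return out

print(cores_by_node(sample))
```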

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 23, 2020, at 4:27 AM, Erick Erickson  wrote:
> 
> Q1: If you’re talking about disk space used up by deleted documents,
>    then yes, optimize or expungeDeletes will recover it. The former
>    will recover it all, the latter will rewrite segments with > 10% deleted
>    documents. HOWEVER: optimize is an expensive operation, and
>    can have deleterious side-effects, especially before Solr 7.5, see:
>    https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/
>    and
>    https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
> 
>    NOTE: if you just ignore it, the deleted data will be merged away as
>    part of normal indexing, so you may not have to do anything.
> 
> Q2: If you delete the collections, the data should be removed from
>    disk, assuming you’re talking about using the Collections API
>    DELETE command. Optimize won’t help because the collection is gone.
>    If you delete the collection and the data dirs are still hanging around,
>    you should look at your logs to see if there’s any information.
> 
> Best,
> Erick
> 
>> On Jun 22, 2020, at 9:04 PM, ChienHuaWang  wrote:
>> 
>> Hi Solr users,
>> 
>> Q1: Wondering if there is any way to retrieve disk usage by host? Could we
>> get it through the metrics API or some other method? I know the data shows in
>> the Solr Admin UI, but I'd like another way to get this kind of data.
>> 
>> Q2: 
>> After deleting the collections, the data seems not to be physically removed
>> from the disk. From my research, someone suggested running an optimize, which
>> rewrites the index out to disk without the deleted documents and then deletes
>> the original. Is there any other way to clean up without rewriting the index?
>> I have to clean up manually now and am looking for a better approach.
>> 
>> Appreciate your feedback.
>> 
>> 
>> Regards,
>> Chien
>> 
>> 
>> 
>> 
>> 
>> --
>> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 



Re: Almost nodes in Solrcloud dead suddently

2020-06-23 Thread Tran Van Hoan
 I checked the node exporter metrics and saw no network problem.

On Tuesday, June 23, 2020, 8:37:41 PM GMT+7, Tran Van Hoan 
 wrote:  
 
  I checked node exporter; no problem with OS, hardware, or network. I attached 
images of Solr metrics for the last 7 days and 12 hours.

On Tuesday, June 23, 2020, 2:23:05 PM GMT+7, Dario Rigolin 
 wrote:  
 
 What about a network issue?

Il giorno mar 23 giu 2020 alle ore 01:37 Tran Van Hoan
 ha scritto:

>
> Dear all,
>
> I have a SolrCloud 8.2.0 cluster with 6 instances on 6 servers (64 GB RAM
> each); each instance has Xmx = Xms = 30G.
>
> Today almost all nodes in the SolrCloud went down twice, at 8:00 AM (5/6
> nodes down) and at 1:00 PM (2/6 nodes down). Yesterday, one node went
> down. Most metrics didn't increase much, except thread counts.
>
> Performance over the past week: (charts not included in the archived message)
>
> Performance over the past 12 hours: (charts not included in the archived message)
>
> I went to the admin UI; some nodes were dead and some took too long to
> respond. Checking the log files, they were generating a lot of entries
> (log level WARNING). Here are the logs that appear in the SolrCloud:
>
> Logs before servers 4 and 6 went down:
>
> - Server 4 before it died:
>
>    + o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException: Idle timeout expired: 12/12
> ms
>
>  +org.apache.solr.client.solrj.SolrServerException: Timeout occured while
> waiting response from server at:
> http://server6:8983/solr/mycollection_shard3_replica_n5/select
>
>
>
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:406)
>
>                at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:746)
>
>                at
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1274)
>
>                at
> org.apache.solr.handler.component.HttpShardHandler.request(HttpShardHandler.java:238)
>
>                at
> org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
>
>                at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
>                at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
>                at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
>                at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
>
>                at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
>
>                at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
>                at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
>                ... 1 more
>
> Caused by: java.util.concurrent.TimeoutException
>
>                at
> org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
>
>                at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:397)
>
>                ... 12 more
>
>
>
> + o.a.s.s.HttpSolrCall invalid return code: -1
>
> + o.a.s.s.PKIAuthenticationPlugin Invalid key request timestamp:
> 1592803662746 , received timestamp: 1592803796152 , TTL: 12
>
> + o.a.s.s.PKIAuthenticationPlugin Decryption failed , key must be wrong =>
> java.security.InvalidKeyException: No installed provider supports this key:
> (null)
>
> +  o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
> SolrCmdDistributor$Req: cmd=delete{,commitWithin=-1}; node=ForwardNode:
> http://server6:8983/solr/mycollection_shard3_replica_n5/ to
> http://server6:8983/solr/mycollection_shard3_replica_n5/ =>
> java.util.concurrent.TimeoutException
>
> + o.a.s.s.HttpSolrCall
> null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
> Async exception during distributed update: null
>
>
>
> Server 2:
>
>  + Max requests queued per destination 3000 exceeded for
> HttpDestination[http://server4:8983
> ]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3
> [c=4/4,b=4,m=0,i=0]
>
>  +  Max requests queued per destination 3000 exceeded for
> HttpDestination[http://server5:8983
> ]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3
> [c=4/4,b=4,m=0,i=0]
>
>
>
> + Timeout occured while waiting response from server at:
> http://server4:8983/solr/mycollection_shard6_replica_n23/select
>
> + Timeout occured while waiting response from server at:
> http://server6:8983/solr/mycollection_shard2_replica_n15/select
>
> +  o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException:
> org.apache.solr.client.solrj.SolrServerException: IOException occured when
> talking to server at: null
>
> Caused by: org.apache.solr.client.solrj.SolrServerException: IOException
> occured when talking to server at: null
>
> Caused by: java.nio.channels.ClosedChannelException
>
>
>
> Server 6:
>
>  + 

Re: Index file on Windows fileshare..

2020-06-23 Thread Erick Erickson
The program I pointed you to should take about an hour to make work.

But otherwise, you can try the post tool:
https://lucene.apache.org/solr/guide/7_2/post-tool.html
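
For the POC, the walking-the-share part of the client is straightforward: collect each file's absolute path (which the original poster wants stored so the UI can link back to the file) and build a doc per file. A minimal sketch; the field names are illustrative, not a fixed schema, and the Tika extraction plus SolrJ indexing from the linked article are omitted:

```python
import os, tempfile, pathlib

def collect_docs(root):
    """Walk a directory tree and build one doc per file, storing the absolute
    path so the end user can click through to the original file."""
    docs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.abspath(os.path.join(dirpath, name))
            docs.append({"id": path, "file_path_s": path, "file_name_s": name})
    return docs

# Demonstrate on a throwaway directory tree:
with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "a.txt").write_text("hello")
    pathlib.Path(tmp, "sub").mkdir()
    pathlib.Path(tmp, "sub", "b.pdf").write_bytes(b"%PDF-")
    docs = collect_docs(tmp)
    names = sorted(d["file_name_s"] for d in docs)
    print(names)  # ['a.txt', 'b.pdf']
```

On a Windows share, `root` would be the UNC path (e.g. a path under `\\server\share`), and the stored path is what the pop-up link opens.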

Best,
Erick

> On Jun 23, 2020, at 8:45 AM, Fiz N  wrote:
> 
Thanks Erick. Is there an easy way of doing this, i.e. indexing files from a
Windows share folder into Solr?
> This is for POC only.
> 
> Thanks
> Nadian.
> 
> On Mon, Jun 22, 2020 at 3:54 PM Erick Erickson 
> wrote:
> 
>> Consider running Tika in a client and indexing the docs to Solr.
>> At that point, you have total control over what’s indexed.
>> 
>> Here’s a skeletal program to get you started:
>> https://lucidworks.com/post/indexing-with-solrj/
>> 
>> Best,
>> Erick
>> 
>>> On Jun 22, 2020, at 1:21 PM, Fiz N  wrote:
>>> 
>>> Hello Solr experts,
>>> 
>>> I am using standalone version of SOLR 8.5 on Windows machine.
>>> 
>>> 1) I want to index all types of files under the different directories in
>>> the file share.
>>> 
>>> 2) I need to index the absolute path of each file and store it in a Solr
>>> field. I need that info so that the end user can click and open the file
>>> (pop-up).
>>> 
>>> Could you please tell me how to go about this?
>>> This is for POC purpose once we finalize the solution we would be further
>>> going ahead with stable approach.
>>> 
>>> Thanks
>>> Fiz Nadian.
>> 
>> 



Re: Index file on Windows fileshare..

2020-06-23 Thread Fiz N
Thanks Erick. Is there an easy way of doing this, i.e. indexing files from a
Windows share folder into Solr?
This is for POC only.

Thanks
Nadian.

On Mon, Jun 22, 2020 at 3:54 PM Erick Erickson 
wrote:

> Consider running Tika in a client and indexing the docs to Solr.
> At that point, you have total control over what’s indexed.
>
> Here’s a skeletal program to get you started:
> https://lucidworks.com/post/indexing-with-solrj/
>
> Best,
> Erick
>
> > On Jun 22, 2020, at 1:21 PM, Fiz N  wrote:
> >
> > Hello Solr experts,
> >
> > I am using standalone version of SOLR 8.5 on Windows machine.
> >
> > 1) I want to index all types of files under the different directories in
> > the file share.
> >
> > 2) I need to index the absolute path of each file and store it in a Solr
> > field. I need that info so that the end user can click and open the file
> > (pop-up).
> >
> > Could you please tell me how to go about this?
> > This is for POC purpose once we finalize the solution we would be further
> > going ahead with stable approach.
> >
> > Thanks
> > Fiz Nadian.
>
>


Re: Sorting in other collection in Solr 8.5.1

2020-06-23 Thread Erick Erickson
You have two separate collections with dissimilar data, so what
does “sorting them in the same order” mean? Your example
sorts on title, so why can’t you sort them both on title? That won’t
work of course for any field that isn’t identical in both
collections.

These are actually pretty small collections. It sounds like you’re 
doing what in SQL terms would be a sub-select. Have you considered
putting all the records (with different types) in the same collection
and using something like join queries or RerankQParser?
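
To make the single-collection suggestion concrete, here is what a join-based request might look like. This only builds the query string; the collection and field names (`type_s`, `form_id_s`, `action_name_s`) are hypothetical, invented for illustration, not taken from the poster's schema:

```python
from urllib.parse import urlencode

# Hypothetical single-collection layout: a "type_s" field distinguishes form
# docs from action docs, and action docs carry "form_id_s" pointing at a form's id.
params = {
    "q": "{!join from=form_id_s to=id}type_s:action AND action_name_s:approve",
    "fq": "type_s:form",
    "sort": "title_s asc",
    "rows": 250,
}
url = "/solr/forms/select?" + urlencode(params)
print(url)
```

Note the join selects forms by their actions; sorting the forms *by* a field of the joined action docs is not something the join itself does, which is where re-ranking or denormalizing the action names onto the form docs comes in.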

Don’t know how that fits into your model….

Best,
Erick

> On Jun 23, 2020, at 2:06 AM, vishal patel  
> wrote:
> 
> Hi
> 
> I am upgrading to Solr 8.5.1. I have created 2 shards, each with one replica,
> and 2 collections: one is form and the second is actionscomment. Form-related
> data is stored in the form collection and the actions of those forms are
> stored in the actionscomment collection.
> There are 10 lakh (1 million) documents in form and 50 lakh (5 million)
> documents in the actionscomment collection.
> 
> form schema.xml: (field definitions stripped by the mailing-list archive)
>
> actionscomment schema.xml: (field definitions stripped by the mailing-list archive)
> 
> 
> 
> We are showing a form listing using the form and actionscomment collections.
> We show only 250 records on the form listing page. The form listing columns
> are id, title, form created date and action names; id, title and form created
> date come from the form collection, and action names come from the
> actionscomment collection. We want to provide sorting functionality for all
> columns. It is easy to sort id, title and form created date because they are
> in the same collection.
> 
> For action-name sorting, I execute 2 queries. First I query the actionscomment
> collection sorted by title and get the form_id list, and using those form_ids
> I query the form collection. But I do not get the proper sorting. Sometimes I
> get so many form ids that my second query becomes very long.
> How can I get data from the form collection in the same order as the form id
> list that came from actionscomment?
> 
> Regards,
> Vishal Patel
> 



Re: Retrieve disk usage & release disk space after delete

2020-06-23 Thread Erick Erickson
Q1: If you’re talking about disk space used up by deleted documents,
    then yes, optimize or expungeDeletes will recover it. The former
    will recover it all, the latter will rewrite segments with > 10% deleted
    documents. HOWEVER: optimize is an expensive operation, and
    can have deleterious side-effects, especially before Solr 7.5, see:
    https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/
    and
    https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/

    NOTE: if you just ignore it, the deleted data will be merged away as
    part of normal indexing, so you may not have to do anything.

Q2: If you delete the collections, the data should be removed from
    disk, assuming you’re talking about using the Collections API
    DELETE command. Optimize won’t help because the collection is gone.
    If you delete the collection and the data dirs are still hanging around,
    you should look at your logs to see if there’s any information.

Best,
Erick
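
For reference, expungeDeletes is passed as a parameter on a commit to the update handler. The snippet below only builds the request URL; the collection name is a placeholder:

```python
from urllib.parse import urlencode

# expungeDeletes rides along with an explicit commit to /update.
params = {"commit": "true", "expungeDeletes": "true"}
url = "/solr/mycollection/update?" + urlencode(params)
print(url)  # /solr/mycollection/update?commit=true&expungeDeletes=true
```

As Erick notes above, this rewrites only segments with more than 10% deleted docs, so it is cheaper than a full optimize but still not free.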

> On Jun 22, 2020, at 9:04 PM, ChienHuaWang  wrote:
> 
> Hi Solr users,
> 
> Q1: Wondering if there is any way to retrieve disk usage by host? Could we
> get it through the metrics API or some other method? I know the data shows in
> the Solr Admin UI, but I'd like another way to get this kind of data.
> 
> Q2: 
> After deleting the collections, the data seems not to be physically removed
> from the disk. From my research, someone suggested running an optimize, which
> rewrites the index out to disk without the deleted documents and then deletes
> the original. Is there any other way to clean up without rewriting the index?
> I have to clean up manually now and am looking for a better approach.
> 
> Appreciate your feedback.
> 
> 
> Regards,
> Chien
> 
> 
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Unable to set preferred leader

2020-06-23 Thread Erick Erickson
First of all, unless you have a lot of shards, worrying about which one is the
leader is
not worth the effort. That code was put in there to deal with a situation where 
there
were 100s of shards and when the system was cold-started they all could have 
their
leader be on the same node.

The extra work a leader does is actually quite minimal, so I wouldn’t start to
worry unless you have a lot of leaders, on the order of 20-30, and then I’d
measure to be sure. And the extra work happens during indexing, when the leader
has to distribute the updates to followers, FWIW.

But to your question, I have no idea. I’d say “look at the logs”, but you’ve 
already
done that. What happens is the preferred leader gets inserted in the 
overseer_election
queue watching the current leader, then the current leader is moved to the end
of the election queue. This _should_ trigger the watch on the preferred leader
to take over. I wouldn’t necessarily expect error messages in the logs BTW, 
you’d
need to look at the INFO level messages for both the PreferredLeader, Overseer
and current leader in that order.

The other place that’d be interesting is where the preferred leader is in the 
leader
election queue for that shard after it’s all done. It actually shouldn’t be in 
the 
election queue at all on success.

Not much help I know. The code is in RebalanceLeaders.java along with some
explanatory notes.

Best,
Erick


> On Jun 23, 2020, at 3:43 AM, Karl Stoney 
>  wrote:
> 
> Hey,
> We have a SolrCloud collection with 8 replicas, and one of those replicas has 
> the `property.preferredleader: true` set.   However when we perform a 
> `REBALANCELEADERS` we get:
> 
> ```
> {
>  "responseHeader": {
>"status": 0,
>"QTime": 62268
>  },
>  "Summary": {
>"Failure": "Not all active replicas with preferredLeader property are 
> leaders"
>  },
>  "failures": {
>"shard1": {
>  "status": "failed",
>  "msg": "Could not change leder for slice shard1 to core_node9"
>}
>  }
> }
> ```
> 
> There is nothing in the solr logs on any of the nodes to indicate the reason 
> for the failure.
> 
> What I have noticed is that 4 of the nodes briefly go orange in the GUI (i.e. 
> “down”), and for a moment 9 of them go yellow (i.e. “recovering”), before 
> all becoming active again with the same (incorrect) leader.
> 
> We use the same model on 4 other collections to set the preferred leader to a 
> particular replica and they all work fine.
> 
> Does anyone have any ideas?
> 
> Thanks
> Karl
> Unless expressly stated otherwise in this email, this e-mail is sent on 
> behalf of Auto Trader Limited Registered Office: 1 Tony Wilson Place, 
> Manchester, Lancashire, M15 4FN (Registered in England No. 03909628). Auto 
> Trader Limited is part of the Auto Trader Group Plc group. This email and any 
> files transmitted with it are confidential and may be legally privileged, and 
> intended solely for the use of the individual or entity to whom they are 
> addressed. If you have received this email in error please notify the sender. 
> This email message has been swept for the presence of computer viruses.



RE: Log4J Logging to Http

2020-06-23 Thread Krönert Florian
Hi Radu,

thanks for your response.
Your different approach is very valuable to me, so thanks for suggesting it.

I'll take a look at the different tools you suggested. I hope there is some 
small and efficient solution for doing this, since throwing a whole Logstash, 
ElasticSearch and Kibana stack on top of it seems quite overwhelming.

Kind Regards,

Florian Krönert
Senior Software Developer


ORBIS AG | Planckstraße 10 | D-88677 Markdorf
Phone: +49 7544 50398 21 | Mobile: +49 162 3065972 | E-Mail: 
florian.kroen...@orbis.de
www.orbis.de



Registered Seat: Saarbrücken
Commercial Register Court: Amtsgericht Saarbrücken, HRB 12022
Board of Management: Thomas Gard (Chairman), Michael Jung, Stefan Mailänder, 
Frank Schmelzer
Chairman of the Supervisory Board: Ulrich Holzer






-Original Message-
From: Radu Gheorghe 
Sent: Donnerstag, 18. Juni 2020 08:24
To: solr-user@lucene.apache.org
Subject: Re: Log4J Logging to Http

Hi Florian,

I don’t know the answer to your specific question, but I would like to suggest 
a different approach. Excuse me in advance, I usually hate suggesting different 
approaches.

The reason why I suggest a different approach is because logging via HTTP can 
be blocking a thread e.g. until a timeout. I wrote a bit more here: 
https://sematext.com/blog/logging-libraries-vs-log-shippers/

In your particular case, I would let Solr log normally (to stdout) and have 
something pick the logs up from the Docker socket. I’m used to Logagent (see 
https://sematext.com/docs/logagent/installation-docker/) which can parse Solr 
logs out of the box (see 
https://github.com/sematext/logagent-js/blob/master/patterns.yml#L140). But 
there are other options, like Fluentd or Logstash.

Best regards,
Radu

> On 17 Jun 2020, at 10:33, Krönert Florian  wrote:
>
> Hello everyone,
>
> We want to log our queries to a HTTP endpoint and tried configuring our log4j 
> settings accordingly.
> We are using Solr inside Docker with the official Solr image (version 
> solr:8.3.1).
>
> As soon as we add a http appender, we receive errors on startup and solr 
> fails to start completely:
>
> 2020-06-17T07:06:54.976390509Z DEBUG StatusLogger
> JsonLayout$Builder(propertiesAsList="null",
> objectMessageAsJsonObject="null", ={}, eventEol="null",
> compact="null", complete="null", locationInfo="null",
> properties="true", includeStacktrace="null",
> stacktraceAsString="null", includeNullDelimiter="null", ={},
> charset="null", footerSerializer=null, headerSerializer=null,
> Configuration(/var/solr/log4j2.xml), footer="null", header="null")
> 2020-06-17T07:06:55.121825039Z 2020-06-17
> 07:06:55.104:WARN:oejw.WebAppContext:main: Failed startup of context
> o.e.j.w.WebAppContext@611df6e3{/solr,file:///opt/solr-8.3.1/server/sol
> r-webapp/webapp/,UNAVAILABLE}{/opt/solr-8.3.1/server/solr-webapp/webap
> p} 2020-06-17T07:06:55.121856339Z java.lang.NoClassDefFoundError:
> Failed to initialize Apache Solr: Could not find necessary SLF4j
> logging jars. If using Jetty, the SLF4j logging jars need to go in the
> jetty lib/ext directory. For other containers, the corresponding
> directory should be used. For more information, see:
> http://wiki.apache.org/solr/SolrLogging
>
> It seems these jars are needed only when using the HTTP appender;
> without this appender everything works.
> Can you point me in the right direction as to where I need to place the needed
> jars? It seems to be a little special since I only access the /var/solr mount
> directly; the rest is running in Docker.
>
> Kind Regards,
>
> Florian Krönert
> Senior Software Developer
>
>
>
> ORBIS AG | Planckstraße 10 | D-88677 Markdorf
> Phone: +49 7544 50398 21 | Mobile: +49 162 3065972 | E-Mail:
> florian.kroen...@orbis.de www.orbis.de
>
>
>
>
> Registered Seat: Saarbrücken
> Commercial Register Court: Amtsgericht Saarbrücken, HRB 12022 Board of
> Management: Thomas Gard (Chairman), Michael Jung, Stefan Mailänder, Frank 
> Schmelzer
> Chairman of the Supervisory Board: Ulrich Holzer
>
>
>
>
>
>



RE: Log4J Logging to Http

2020-06-23 Thread Krönert Florian
Hi Shawn,

Thanks for your response, I'll take a look; it seems the SLF4j jars are missing 
in there.
The filesystem is not really something that I am going to tweak inside that 
Docker container, so I might take a look at different approaches as well.

Kind Regards,

Florian Krönert
Senior Software Developer


ORBIS AG | Planckstraße 10 | D-88677 Markdorf
Phone: +49 7544 50398 21 | Mobile: +49 162 3065972 | E-Mail: 
florian.kroen...@orbis.de
www.orbis.de



Registered Seat: Saarbrücken
Commercial Register Court: Amtsgericht Saarbrücken, HRB 12022
Board of Management: Thomas Gard (Chairman), Michael Jung, Stefan Mailänder, 
Frank Schmelzer
Chairman of the Supervisory Board: Ulrich Holzer






-Original Message-
From: Shawn Heisey 
Sent: Donnerstag, 18. Juni 2020 04:22
To: solr-user@lucene.apache.org
Subject: Re: Log4J Logging to Http

On 6/17/2020 1:33 AM, Krönert Florian wrote:
> 2020-06-17T07:06:55.121856339Z java.lang.NoClassDefFoundError: Failed
> to initialize Apache Solr: Could not find necessary SLF4j logging
> jars. If using Jetty, the SLF4j logging jars need to go in the jetty
> lib/ext directory. For other containers, the corresponding directory
> should be used. For more information, see:
> http://wiki.apache.org/solr/SolrLogging
>
> It seems these jars are needed only when using the HTTP appender;
> without this appender everything works.

There must be some aspect of your log4j2.xml configuration that requires a jar 
that is not included with Solr.

> Can you point me in the right direction as to where I need to place the
> needed jars? It seems to be a little special since I only access the
> /var/solr mount directly; the rest is running in Docker.

If there are extra jars needed for your logging config, they should go in the 
server/lib/ext directory, which should already exist and contain several jars 
related to logging.
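
For reference, a minimal sketch of what an HTTP appender section in log4j2.xml might look like. The appender name, endpoint URL, and logger name are placeholders invented for illustration; note that JsonLayout additionally requires the jackson jars on the classpath, which would likewise go in server/lib/ext:

```xml
<Configuration>
  <Appenders>
    <!-- Placeholder endpoint; JsonLayout needs the jackson jars available -->
    <Http name="QueryLog" url="https://logs.example.internal/ingest">
      <JsonLayout compact="true" eventEol="true" properties="true"/>
    </Http>
  </Appenders>
  <Loggers>
    <!-- Hypothetical: route one logger's events to the HTTP endpoint -->
    <Logger name="org.apache.solr" level="info" additivity="true">
      <AppenderRef ref="QueryLog"/>
    </Logger>
    <Root level="info"/>
  </Loggers>
</Configuration>
```

This is a fragment, not a complete config; in practice you would merge it into the log4j2.xml Solr ships with so console and file logging keep working.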

Thanks,
Shawn


Unable to set preferred leader

2020-06-23 Thread Karl Stoney
Hey,
We have a SolrCloud collection with 8 replicas, and one of those replicas has 
the `property.preferredleader: true` set.   However when we perform a 
`REBALANCELEADERS` we get:

```
{
  "responseHeader": {
"status": 0,
"QTime": 62268
  },
  "Summary": {
"Failure": "Not all active replicas with preferredLeader property are 
leaders"
  },
  "failures": {
"shard1": {
  "status": "failed",
  "msg": "Could not change leder for slice shard1 to core_node9"
}
  }
}
```

There is nothing in the solr logs on any of the nodes to indicate the reason 
for the failure.

What I have noticed is that 4 of the nodes briefly go orange in the GUI (i.e. 
“down”), and for a moment 9 of them go yellow (i.e. “recovering”), before all 
become active again with the same (incorrect) leader.

We use the same model on 4 other collections to set the preferred leader to a 
particular replica and they all work fine.
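
For reference, the calls we make look roughly like this (the host, collection
and replica names below are placeholders, not our real ones):

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # placeholder host

def addreplicaprop_url(collection, shard, replica):
    # Marks one replica as the preferred leader for its shard.
    params = {
        "action": "ADDREPLICAPROP",
        "collection": collection,
        "shard": shard,
        "replica": replica,
        "property": "preferredLeader",
        "property.value": "true",
    }
    return "{0}/admin/collections?{1}".format(SOLR, urlencode(params))

def rebalanceleaders_url(collection, max_wait_seconds=60):
    # Asks Solr to make the preferredLeader replicas the actual shard leaders.
    params = {
        "action": "REBALANCELEADERS",
        "collection": collection,
        "maxWaitSeconds": max_wait_seconds,
    }
    return "{0}/admin/collections?{1}".format(SOLR, urlencode(params))

print(addreplicaprop_url("mycollection", "shard1", "core_node9"))
print(rebalanceleaders_url("mycollection"))
```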

Does anyone have any ideas?

Thanks
Karl
Unless expressly stated otherwise in this email, this e-mail is sent on behalf 
of Auto Trader Limited Registered Office: 1 Tony Wilson Place, Manchester, 
Lancashire, M15 4FN (Registered in England No. 03909628). Auto Trader Limited 
is part of the Auto Trader Group Plc group. This email and any files 
transmitted with it are confidential and may be legally privileged, and 
intended solely for the use of the individual or entity to whom they are 
addressed. If you have received this email in error please notify the sender. 
This email message has been swept for the presence of computer viruses.


Retrieve disk usage & release disk space after delete

2020-06-23 Thread ChienHuaWang
Hi Solr users,

Q1: Is there any way to retrieve disk usage by host? Can we get it through the
Metrics API or some other method? I know the data shows up in the Solr Admin
UI, but I would like another way to access this kind of data.
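
For context, the kind of Metrics API call I was looking at is sketched below
(the host is a placeholder, and the metric names are my reading of the docs,
so treat the sample response shape as an assumption):

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # placeholder host

def index_size_metrics_url():
    # INDEX.sizeInBytes is reported per core under the "core" metric group.
    params = {"group": "core", "prefix": "INDEX.sizeInBytes", "wt": "json"}
    return "{0}/admin/metrics?{1}".format(SOLR, urlencode(params))

def total_index_bytes(metrics_json):
    # Sums INDEX.sizeInBytes across all cores in one node's response.
    return sum(core["INDEX.sizeInBytes"]
               for core in metrics_json.get("metrics", {}).values())

# A trimmed sample of the shape the API returns (made-up core names):
sample = {"metrics": {
    "solr.core.coll.shard1.replica_n1": {"INDEX.sizeInBytes": 1024},
    "solr.core.coll.shard2.replica_n3": {"INDEX.sizeInBytes": 2048},
}}
print(total_index_bytes(sample))  # 3072
```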

Q2: 
After deleting collections, the data does not seem to be physically removed
from the disk. From my research, one suggestion is to run an optimize, which
re-writes the index to disk without the deleted documents and then deletes the
original. Is there any other way to clean up without re-writing the index? I
have to clean up manually now and am looking for a better approach.
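
For deleted documents, the lighter alternative I have read about (instead of a
full optimize) is a commit with expungeDeletes, which only rewrites segments
that contain deletions; roughly (host and collection names are placeholders):

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # placeholder host

def expunge_deletes_url(collection):
    # A commit with expungeDeletes=true rewrites only segments that contain
    # deleted documents, rather than merging the whole index like optimize.
    params = {"commit": "true", "expungeDeletes": "true"}
    return "{0}/{1}/update?{2}".format(SOLR, collection, urlencode(params))

print(expunge_deletes_url("mycollection"))
```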

Appreciate your feedback.


Regards,
Chien





--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Almost nodes in Solrcloud dead suddently

2020-06-23 Thread Dario Rigolin
What about a network issue?

On Tue, Jun 23, 2020 at 01:37 Tran Van Hoan
 wrote:

>
> Dear all,
>
> I have a SolrCloud 8.2.0 cluster with 6 instances across 6 servers (64 GB
> RAM each); each instance has Xmx = Xms = 30G.
>
> Today almost all the nodes in the SolrCloud went down twice, at 8:00 AM
> (5/6 nodes down) and at 1:00 PM (2/6 nodes down). Yesterday, one node went
> down. Most metrics did not increase much, except thread counts.
>
> Performance one week ago: (inline images not preserved in the plain-text
> archive)
>
> Performance 12 hours ago: (inline images not preserved in the plain-text
> archive)
>
> I went to the Admin UI; some nodes were dead and some took too long to
> respond. When checking the log files, they were generating a lot of entries
> (log level WARN). Here are the log messages that appeared in the SolrCloud:
>
> Log before server 4 and 6 down
>
> - Server 4 before it dead:
>
>+ o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException: Idle timeout expired: 12/12
> ms
>
>   +org.apache.solr.client.solrj.SolrServerException: Timeout occured while
> waiting response from server at:
> http://server6:8983/solr/mycollection_shard3_replica_n5/select
>
>
>
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:406)
>
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:746)
>
> at
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1274)
>
> at
> org.apache.solr.handler.component.HttpShardHandler.request(HttpShardHandler.java:238)
>
> at
> org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
>
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
> ... 1 more
>
> Caused by: java.util.concurrent.TimeoutException
>
> at
> org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
>
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:397)
>
> ... 12 more
>
>
>
> + o.a.s.s.HttpSolrCall invalid return code: -1
>
> + o.a.s.s.PKIAuthenticationPlugin Invalid key request timestamp:
> 1592803662746 , received timestamp: 1592803796152 , TTL: 12
>
> + o.a.s.s.PKIAuthenticationPlugin Decryption failed , key must be wrong =>
> java.security.InvalidKeyException: No installed provider supports this key:
> (null)
>
> +  o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
> SolrCmdDistributor$Req: cmd=delete{,commitWithin=-1}; node=ForwardNode:
> http://server6:8983/solr/mycollection_shard3_replica_n5/ to
> http://server6:8983/solr/mycollection_shard3_replica_n5/ =>
> java.util.concurrent.TimeoutException
>
> + o.a.s.s.HttpSolrCall
> null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
> Async exception during distributed update: null
>
>
>
> Server 2:
>
>  + Max requests queued per destination 3000 exceeded for
> HttpDestination[http://server4:8983
> ]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3
> [c=4/4,b=4,m=0,i=0]
>
>  +  Max requests queued per destination 3000 exceeded for
> HttpDestination[http://server5:8983
> ]@7d7ec93c,queue=3000,pool=MultiplexConnectionPool@73b938e3
> [c=4/4,b=4,m=0,i=0]
>
>
>
> + Timeout occured while waiting response from server at:
> http://server4:8983/solr/mycollection_shard6_replica_n23/select
>
> + Timeout occured while waiting response from server at:
> http://server6:8983/solr/mycollection_shard2_replica_n15/select
>
> +   o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException:
> org.apache.solr.client.solrj.SolrServerException: IOException occured when
> talking to server at: null
>
> Caused by: org.apache.solr.client.solrj.SolrServerException: IOException
> occured when talking to server at: null
>
> Caused by: java.nio.channels.ClosedChannelException
>
>
>
> Server 6:
>
>  + org.apache.solr.client.solrj.SolrServerException: Timeout occured while
> waiting response from server at:
> http://server6:8983/solr/mycollection_shard2_replica_n15/select
>
>  + + org.apache.solr.client.solrj.SolrServerException: Timeout occured
> while waiting response from server at: Timeout occured while waiting
> response 

Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version

2020-06-23 Thread vishal patel
Is there any other option?

Sent from Outlook

From: Mikhail Khludnev 
Sent: Sunday, May 24, 2020 3:24 AM
To: solr-user 
Subject: Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version

Unfortunately {!terms} doesn't let one ^boost terms.
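
To make that concrete, here is a sketch of the two query forms (the field name
and ids are illustrative placeholders, not a drop-in query):

```python
def terms_fq(field, ids):
    # {!terms} form: efficient for long id lists, but no per-term ^boosts.
    return "{{!terms f={0}}}{1}".format(field, ",".join(ids))

def boosted_or_q(field, ids, boost):
    # Boolean OR form: supports ^boost on each term, but is much slower
    # when the id list grows to thousands of entries.
    clauses = " OR ".join("{0}^{1}".format(i, boost) for i in ids)
    return "{0}:({1})".format(field, clauses)

ids = ["101", "102", "103"]
print(terms_fq("msg_id", ids))         # {!terms f=msg_id}101,102,103
print(boosted_or_q("msg_id", ids, 2))  # msg_id:(101^2 OR 102^2 OR 103^2)
```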

On Sat, May 23, 2020 at 10:13 AM vishal patel 
wrote:

> Hi Jason
>
> Thanks for reply.
>
> I have checked Jay's query using the "terms" query parser and it is really
> helpful to us. After executing with the "terms" query parser, the response
> comes back within 500 milliseconds even though grouping is applied.
> Jay's Query :
> https://drive.google.com/file/d/1bavCqwHfJxoKHFzdOEt-mSG8N0fCHE-w/view
>
> Actually I want to apply the same approach in my query, but my field
> "msg_id" has a boost applied, and grouping is also used in my query.
> I am also upgrading to Solr 8.5.1.
>
>
> MY query is :
> https://drive.google.com/file/d/1Op_Ja292Bcnv0Ijxw6VdAxvGlfsdczmS/view
>
> The above query takes 30 seconds. How can I use the "terms" query parser
> in my query?
>
> Regards,
> Vishal Patel
> 
> From: Jason Gerlowski 
> Sent: Friday, May 22, 2020 2:59 AM
> To: solr-user@lucene.apache.org 
> Subject: Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version
>
> Hi Jay,
>
> I can't speak to why you're seeing a performance change between 6.x
> and 8.x.  What I can suggest though is an alternative way of
> formulating the query: you might get different performance if you run
> your query using Solr's "terms" query parser:
>
> https://lucene.apache.org/solr/guide/8_5/other-parsers.html#terms-query-parser
>  It's not guaranteed to help, but there's a chance it'll work for you.
> And knowing whether or not it helps might point others here towards
> the cause of your slowdown.
>
> Even if "terms" performs better for you, it's probably worth
> understanding what's going on here of course.
>
> Are all other queries running comparably?
>
> Jason
>
> On Thu, May 21, 2020 at 10:25 AM jay harkhani 
> wrote:
> >
> > Hello,
> >
> > Please refer below details.
> >
> > >Did you create Solrconfig.xml for the collection from scratch after
> upgrading and reindexing?
> > Yes, We have created collection from scratch and also re-indexing.
> >
> > >Was it based on the latest template?
> > Yes, It was as per latest template.
> >
> > >What happens if you reexecute the query?
> > No visible difference; only a minor change in milliseconds.
> >
> > >Are there other processes/containers running on the same VM?
> > No
> >
> > >How much heap and how much total memory you have?
> > My heap and total memory are the same as with Solr 6.1.0: 5 GB heap and
> 25 GB total memory. In my view there is no memory-related issue.
> >
> > >Maybe also you need to increase the corresponding caches in the config.
> > We are not using caches in either version.
> >
> > Both version have same configuration.
> >
> > Regards,
> > Jay Harkhani.
> >
> > 
> > From: Jörn Franke 
> > Sent: Thursday, May 21, 2020 7:05 PM
> > To: solr-user@lucene.apache.org 
> > Subject: Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version
> >
> > Did you create Solrconfig.xml for the collection from scratch after
> upgrading and reindexing? Was it based on the latest template?
> > If not then please try this. Maybe also you need to increase the
> corresponding caches in the config.
> >
> > What happens if you reexecute the query?
> >
> > Are there other processes/containers running on the same VM?
> >
> > How much heap and how much total memory you have? You should only have a
> minor fraction of the memory as heap and most of it „free“ (this means it
> is used for file caches).
> >
> >
> >
> > > Am 21.05.2020 um 15:24 schrieb vishal patel <
> vishalpatel200...@outlook.com>:
> > >
> > > Any one is looking this issue?
> > > I got same issue.
> > >
> > > Regards,
> > > Vishal Patel
> > >
> > >
> > >
> > > 
> > > From: jay harkhani 
> > > Sent: Wednesday, May 20, 2020 7:39 PM
> > > To: solr-user@lucene.apache.org 
> > > Subject: Query takes more time in Solr 8.5.1 compare to 6.1.0 version
> > >
> > > Hello,
> > >
> > > I recently upgraded Solr from 6.1.0 to 8.5.1 and came across
> one issue. A query that has many ids (around 3000) and has grouping applied
> takes much longer to execute. In Solr 6.1.0 it takes 677 ms and in Solr 8.5.1
> it takes 26090 ms. While taking these readings we had the same Solr schema
> and the same number of records in both Solr versions.
> > >
> > > Please refer below details for query, logs and thead dump (generate
> from Solr Admin while execute query).
> > >
> > > Query :
> https://drive.google.com/file/d/1bavCqwHfJxoKHFzdOEt-mSG8N0fCHE-w/view
> > >
> > > Logs and Thread dump stack trace
> > > Solr 8.5.1 :
> https://drive.google.com/file/d/149IgaMdLomTjkngKHrwd80OSEa1eJbBF/view
> > > Solr 6.1.0 :
> https://drive.google.com/file/d/13v1u__fM8nHfyvA0Mnj30IhdffW6xhwQ/view
> > >
> > > To analyse further more 

Sorting in other collection in Solr 8.5.1

2020-06-23 Thread vishal patel
Hi

I am upgrading to Solr 8.5.1. I have created 2 shards, each with one replica.
I have created 2 collections: one is form and the second is actionscomment.
Form-related data is stored in the form collection and the actions of those
forms are stored in the actionscomment collection.
There are 10 lakh (1 million) documents in form and 50 lakh (5 million)
documents in the actionscomment collection.

form schema.xml: (field definitions not preserved in the plain-text archive)

actionscomment schema.xml: (field definitions not preserved in the plain-text
archive)

We show a form listing built from the form and actionscomment collections. We 
show only 250 records on the form listing page. The form listing columns are 
id, title, form created date and action names. id, title and form created date 
come from the form collection, and action names come from the actionscomment 
collection. We want to provide sorting functionality for all columns. It is 
easy to sort id, title and form created date because they are in the same 
collection.

For action name sorting, I execute 2 queries. First I query the actionscomment 
collection sorted on the title field and get the form_id list; then, using 
those form_ids, I query the form collection. But I do not get the proper sort 
order. Sometimes I get so many form ids that my second query becomes very 
large.
How can I get data from the form collection in the same order as the form id 
list that came from actionscomment?
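
One workaround I am considering is re-ordering the second result set
client-side using the id order from the first query; a sketch (the field name
and documents below are made up for illustration):

```python
def reorder_by_id_list(docs, ordered_ids, id_field="id"):
    # Sort the docs returned by the form collection so they follow the
    # form_id order produced by the actionscomment query; ids missing
    # from the ordered list (if any) sink to the end.
    position = {doc_id: i for i, doc_id in enumerate(ordered_ids)}
    return sorted(docs, key=lambda d: position.get(d[id_field], len(position)))

ordered = ["f3", "f1", "f2"]                       # order from actionscomment
docs = [{"id": "f1"}, {"id": "f2"}, {"id": "f3"}]  # order from form query
print([d["id"] for d in reorder_by_id_list(docs, ordered)])  # ['f3', 'f1', 'f2']
```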

Regards,
Vishal Patel