RE: Backup a solr cloud collection - timeout in 180s?

2018-04-10 Thread Petersen, Robert (Contr)
Erick:

Good to know!

Thx
Robi


-----Original Message-----
From: Erick Erickson <erickerick...@gmail.com> 
Sent: Tuesday, April 10, 2018 12:42 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Backup a solr cloud collection - timeout in 180s?

Robi:

Yeah, the ref guide has lots and lots of info, but at 1,100 pages and growing, 
things can be "interesting" to find.

Do be aware of one thing: the async ID should be unique, and before 7.3 there 
was a bug where using the same ID twice (without waiting for completion and 
deleting the old status first) led to bewildering results.
See: https://issues.apache.org/jira/browse/SOLR-11739.

The operations would succeed, but you might not be getting the status of the 
task you think you are.
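In practice that suggests minting a fresh ID per request rather than hard-coding one. A minimal sketch (host, backup name, and collection are this thread's examples; it only builds the URLs rather than sending them):

```shell
#!/bin/sh
# Sketch only: builds the request URLs rather than sending them.
SOLR="http://localhost:8983/solr"

# A fresh ID per backup (epoch seconds here) sidesteps the pre-7.3 reuse bug.
ASYNC_ID="backup-$(date +%s)"

BACKUP_URL="${SOLR}/admin/collections?action=BACKUP&name=addrsearchBackup&collection=addrsearch&location=/apps/logs/backups&async=${ASYNC_ID}"
STATUS_URL="${SOLR}/admin/collections?action=REQUESTSTATUS&requestid=${ASYNC_ID}"

echo "$BACKUP_URL"
# curl -s "$BACKUP_URL"   # then poll STATUS_URL, and DELETESTATUS when done
```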


Best,
Erick

On Tue, Apr 10, 2018 at 9:25 AM, Petersen, Robert (Contr) 
<robert.peters...@ftr.com> wrote:
> HI Erick,
>
>
> I *just* found that parameter in the guide... it was waaay down at the bottom 
> of the page (in proverbial small print)!
>
>
> So for other readers the steps are this:
>
> # start the backup async enabled
>
> /admin/collections?action=BACKUP&name=addrsearchBackup&collection=addrsearch&location=/apps/logs/backups&async=1234
>
>
> # check on the status of the async job
>
> /admin/collections?action=REQUESTSTATUS&requestid=1234
>
>
> # clear out the status when done
>
> /admin/collections?action=DELETESTATUS&requestid=1234
>
>
> Thx
>
> Robi
>
> 
> From: Erick Erickson <erickerick...@gmail.com>
> Sent: Tuesday, April 10, 2018 8:24:20 AM
> To: solr-user
> Subject: Re: Backup a solr cloud collection - timeout in 180s?
>
> 
> WARNING: External email. Please verify sender before opening attachments or 
> clicking on links.
> 
>
>
>
> Specify the "async" property, see:
> https://lucene.apache.org/solr/guide/6_6/collections-api.html
>
> There's also a way to check the status of the backup running in the 
> background.
>
> Best,
> Erick
>
> On Mon, Apr 9, 2018 at 11:05 AM, Petersen, Robert (Contr) 
> <robert.peters...@ftr.com> wrote:
>> Shouldn't this just create the backup file(s) asynchronously? Can the 
>> timeout be adjusted?
>>
>>
>> Solr 7.2.1 with five nodes and the addrsearch collection is five 
>> shards x five replicas and "numFound":38837970 docs
>>
>>
>> Thx
>>
>> Robi
>>
>>
>> http://myServer.corp.pvt:8983/solr/admin/collections?action=BACKUP&name=addrsearchBackup&collection=addrsearch&location=/apps/logs/backups
>>
>>
>> {
>>   "responseHeader": {
>>     "status": 500,
>>     "QTime": 180211
>>   },
>>   "error": {
>>     "metadata": [
>>       "error-class",
>>       "org.apache.solr.common.SolrException",
>>       "root-error-class",
>>       "org.apache.solr.common.SolrException"
>>     ],
>>     "msg": "backup the collection time out:180s",
>>     ...
>>
>>
>> From the logs:
>>
>>
>> 2018-04-09 17:47:32.667 INFO  (qtp64830413-22) [   ] o.a.s.s.HttpSolrCall 
>> [admin] webapp=null path=/admin/collections 
>> params={name=addrsearchBackup&action=BACKUP&location=/apps/logs/backups&collection=addrsearch}
>>  status=500 QTime=180211
>> 2018-04-09 17:47:32.667 ERROR (qtp64830413-22) [   ] o.a.s.s.HttpSolrCall 
>> null:org.apache.solr.common.SolrException: backup the collection time 
>> out:180s
>> at 
>> org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:314)
>> at 
>> org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:246)
>> at 
>> org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:224)
>> at 
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>> at 
>> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)
>> at 
>> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)
>> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:497)
>> at 
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
>> at 
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilte
>> r.java:326)
>>
>>
>>
>> 
>>
>> This communication is confidential. Frontier only sends and receives email 
>> on the basis of the terms set out at 
>> http://www.frontier.com/email_disclaimer.


Re: Backup a solr cloud collection - timeout in 180s?

2018-04-10 Thread Petersen, Robert (Contr)
HI Erick,


I *just* found that parameter in the guide... it was waaay down at the bottom 
of the page (in proverbial small print)!


So for other readers the steps are this:

# start the backup async enabled

/admin/collections?action=BACKUP&name=addrsearchBackup&collection=addrsearch&location=/apps/logs/backups&async=1234


# check on the status of the async job

/admin/collections?action=REQUESTSTATUS&requestid=1234


# clear out the status when done

/admin/collections?action=DELETESTATUS&requestid=1234
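Strung together, the three steps above might look like this as a shell sketch: start the backup, poll REQUESTSTATUS until the task finishes, then clear the stored status. The host is a placeholder; the ID, collection, and path are this thread's examples, and `parse_state` is a helper I'm assuming for pulling "state" out of the JSON response.

```shell
#!/bin/sh
# Sketch of the async backup workflow from the steps above.
SOLR="http://localhost:8983/solr"
ID=1234

parse_state() {
  # Extract the value of "state" from a REQUESTSTATUS JSON response.
  echo "$1" | sed -n 's/.*"state" *: *"\([^"]*\)".*/\1/p'
}

backup() {
  curl -s "${SOLR}/admin/collections?action=BACKUP&name=addrsearchBackup&collection=addrsearch&location=/apps/logs/backups&async=${ID}"
}

poll() {
  # Loop until the task leaves the submitted/running states.
  while :; do
    STATE=$(parse_state "$(curl -s "${SOLR}/admin/collections?action=REQUESTSTATUS&requestid=${ID}")")
    [ "$STATE" = running ] || [ "$STATE" = submitted ] || break
    sleep 5
  done
  echo "$STATE"
}

cleanup() {
  # Clear the stored status so the ID can be reused safely.
  curl -s "${SOLR}/admin/collections?action=DELETESTATUS&requestid=${ID}"
}
```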


Thx

Robi


From: Erick Erickson <erickerick...@gmail.com>
Sent: Tuesday, April 10, 2018 8:24:20 AM
To: solr-user
Subject: Re: Backup a solr cloud collection - timeout in 180s?






Specify the "async" property, see:
https://lucene.apache.org/solr/guide/6_6/collections-api.html

There's also a way to check the status of the backup running in the background.

Best,
Erick

On Mon, Apr 9, 2018 at 11:05 AM, Petersen, Robert (Contr)
<robert.peters...@ftr.com> wrote:
> Shouldn't this just create the backup file(s) asynchronously? Can the timeout 
> be adjusted?
>
>
> Solr 7.2.1 with five nodes and the addrsearch collection is five shards x 
> five replicas and "numFound":38837970 docs
>
>
> Thx
>
> Robi
>
>
> http://myServer.corp.pvt:8983/solr/admin/collections?action=BACKUP&name=addrsearchBackup&collection=addrsearch&location=/apps/logs/backups
>
>
> {
>   "responseHeader": {
>     "status": 500,
>     "QTime": 180211
>   },
>   "error": {
>     "metadata": [
>       "error-class",
>       "org.apache.solr.common.SolrException",
>       "root-error-class",
>       "org.apache.solr.common.SolrException"
>     ],
>     "msg": "backup the collection time out:180s",
>     ...
>
>
> From the logs:
>
>
> 2018-04-09 17:47:32.667 INFO  (qtp64830413-22) [   ] o.a.s.s.HttpSolrCall 
> [admin] webapp=null path=/admin/collections 
> params={name=addrsearchBackup&action=BACKUP&location=/apps/logs/backups&collection=addrsearch}
>  status=500 QTime=180211
> 2018-04-09 17:47:32.667 ERROR (qtp64830413-22) [   ] o.a.s.s.HttpSolrCall 
> null:org.apache.solr.common.SolrException: backup the collection time out:180s
> at 
> org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:314)
> at 
> org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:246)
> at 
> org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:224)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
> at 
> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)
> at 
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:497)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>
>
>
> 
>


Backup a solr cloud collection - timeout in 180s?

2018-04-09 Thread Petersen, Robert (Contr)
Shouldn't this just create the backup file(s) asynchronously? Can the timeout 
be adjusted?


Solr 7.2.1 with five nodes and the addrsearch collection is five shards x five 
replicas and "numFound":38837970 docs


Thx

Robi


http://myServer.corp.pvt:8983/solr/admin/collections?action=BACKUP&name=addrsearchBackup&collection=addrsearch&location=/apps/logs/backups


{
  "responseHeader": {
    "status": 500,
    "QTime": 180211
  },
  "error": {
    "metadata": [
      "error-class",
      "org.apache.solr.common.SolrException",
      "root-error-class",
      "org.apache.solr.common.SolrException"
    ],
    "msg": "backup the collection time out:180s",
    ...


From the logs:


2018-04-09 17:47:32.667 INFO  (qtp64830413-22) [   ] o.a.s.s.HttpSolrCall 
[admin] webapp=null path=/admin/collections 
params={name=addrsearchBackup&action=BACKUP&location=/apps/logs/backups&collection=addrsearch}
 status=500 QTime=180211
2018-04-09 17:47:32.667 ERROR (qtp64830413-22) [   ] o.a.s.s.HttpSolrCall 
null:org.apache.solr.common.SolrException: backup the collection time out:180s
at 
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:314)
at 
org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:246)
at 
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:224)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
at 
org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)
at 
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:497)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)







CDCR - cross data center replication

2018-01-25 Thread Petersen, Robert (Contr)
Hi all,


So for an initial CDCR setup, the documentation says the bulk load should be 
performed first, otherwise CDCR won't keep up. Does "bulk load" include an ETL 
process doing rapid atomic updates one doc at a time (with multiple threads), 
say 4K docs per minute, assuming the bandwidth between DCs is actually good?


Also, as a follow-up question: the documentation says to do the bulk load 
first, sync the data centers, and then turn on CDCR. What is recommended for 
the initial sync? A Solr backup and restore?
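A backup/restore seeding pass might be sketched like this (hosts are placeholders, and it assumes a location both clusters can reach, e.g. a shared mount or a directory copied between DCs; it only builds the URLs):

```shell
#!/bin/sh
# Sketch: seed the Target DC from a Source backup before enabling CDCR.
SRC="http://source-dc:8983/solr"
TGT="http://target-dc:8983/solr"

# 1. Backup the collection on the Source.
BACKUP_URL="${SRC}/admin/collections?action=BACKUP&name=seed&collection=addrsearch&location=/mnt/shared&async=seed-1"
# 2. After it completes (check REQUESTSTATUS), restore on the Target.
RESTORE_URL="${TGT}/admin/collections?action=RESTORE&name=seed&collection=addrsearch&location=/mnt/shared&async=seed-2"

echo "$BACKUP_URL"
echo "$RESTORE_URL"
# 3. Only then enable CDCR, for the incremental updates.
```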


Thanks

Robi


CDCR is unlikely to be satisfactory for bulk-load situations where the update 
rate is high, especially if the bandwidth between the Source and Target 
clusters is restricted. In this scenario, the initial bulk load should be 
performed, the Source and Target data centers synchronized and CDCR be utilized 
for incremental updates.





Re: solr 5.4.1 leader issue

2018-01-08 Thread Petersen, Robert (Contr)
OK, just restarting all the Solr nodes did fix it; since they are in production 
I was hesitant to do that.
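For anyone hitting this later: before restarting, it can help to confirm what ZooKeeper actually records for the shard via CLUSTERSTATUS, then restart the Solr nodes one at a time. A sketch (host is a placeholder; the collection name is this thread's):

```shell
#!/bin/sh
# Sketch: check ZK's recorded leader per replica before a rolling restart.
SOLR="http://localhost:8983/solr"

CLUSTERSTATUS_URL="${SOLR}/admin/collections?action=CLUSTERSTATUS&collection=custsearch"
echo "$CLUSTERSTATUS_URL"
# curl -s "$CLUSTERSTATUS_URL"      # look for "leader":"true" on each replica
# then restart nodes one at a time, waiting for recovery between each:
#   bin/solr restart -p 8983
```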


From: Petersen, Robert (Contr) <robert.peters...@ftr.com>
Sent: Monday, January 8, 2018 12:34:28 PM
To: solr-user@lucene.apache.org
Subject: solr 5.4.1 leader issue

Hi, two out of my three servers think they are replicas on one shard and are 
getting exceptions; I'm wondering what the easiest way to fix this is. Can I 
just restart ZooKeeper across the servers? Here are the exceptions:


TY

Robi


ERROR
null
RecoveryStrategy
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:607)
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:364)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://10.209.55.10:8983/solr: We are not the leader
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:285)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:281)
... 5 more
(and on the one everyone thinks is the leader)
Error while trying to recover. 
core=custsearch_shard3_replica3:org.apache.solr.common.SolrException: Cloud 
state still says we are leader.
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)






Re: solr 5.4.1 leader issue

2018-01-08 Thread Petersen, Robert (Contr)
Perhaps I didn't explain well: all three nodes are live. Two are in recovering 
mode, the exception being that they can't get to the leader because the leader 
replies that he is not the leader. On the dashboard it shows him as the leader, 
but he thinks he isn't. The exceptions are below... Do I have to restart the 
Solr instances, the ZooKeeper instances, or both, or is there another, better 
way that avoids restarting everything?


Thx

Robi


From: Petersen, Robert (Contr) <robert.peters...@ftr.com>
Sent: Monday, January 8, 2018 12:34:28 PM
To: solr-user@lucene.apache.org
Subject: solr 5.4.1 leader issue

Hi, two out of my three servers think they are replicas on one shard and are 
getting exceptions; I'm wondering what the easiest way to fix this is. Can I 
just restart ZooKeeper across the servers? Here are the exceptions:


TY

Robi


ERROR
null
RecoveryStrategy
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:607)
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:364)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://10.209.55.10:8983/solr: We are not the leader
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:285)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:281)
... 5 more
(and on the one everyone thinks is the leader)
Error while trying to recover. 
core=custsearch_shard3_replica3:org.apache.solr.common.SolrException: Cloud 
state still says we are leader.
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)






Re: solr 5.4.1 leader issue

2018-01-08 Thread Petersen, Robert (Contr)
I'm on zookeeper 3.4.8


From: Petersen, Robert (Contr) <robert.peters...@ftr.com>
Sent: Monday, January 8, 2018 12:34:28 PM
To: solr-user@lucene.apache.org
Subject: solr 5.4.1 leader issue

Hi, two out of my three servers think they are replicas on one shard and are 
getting exceptions; I'm wondering what the easiest way to fix this is. Can I 
just restart ZooKeeper across the servers? Here are the exceptions:


TY

Robi


ERROR
null
RecoveryStrategy
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:607)
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:364)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://10.209.55.10:8983/solr: We are not the leader
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:285)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:281)
... 5 more
(and on the one everyone thinks is the leader)
Error while trying to recover. 
core=custsearch_shard3_replica3:org.apache.solr.common.SolrException: Cloud 
state still says we are leader.
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)






solr 5.4.1 leader issue

2018-01-08 Thread Petersen, Robert (Contr)
Hi, two out of my three servers think they are replicas on one shard and are 
getting exceptions; I'm wondering what the easiest way to fix this is. Can I 
just restart ZooKeeper across the servers? Here are the exceptions:


TY

Robi


ERROR
null
RecoveryStrategy
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
Error while trying to recover. 
core=custsearch_shard3_replica1:java.util.concurrent.ExecutionException: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://x.x.x.x:8983/solr: We are not the leader
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:607)
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:364)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://10.209.55.10:8983/solr: We are not the leader
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:575)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:285)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:281)
... 5 more
(and on the one everyone thinks is the leader)
Error while trying to recover. 
core=custsearch_shard3_replica3:org.apache.solr.common.SolrException: Cloud 
state still says we are leader.
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:226)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:232)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)






Re: Any Insights SOLR Rank tuning tool

2017-12-14 Thread Petersen, Robert (Contr)
I remember when FAST (when it was still FAST) came to our enterprise to pitch 
their search while we were looking to replace our AltaVista search engine with 
*something*, and they demonstrated that relevance tool for the business side. 
While that thing was awesome, I've never seen anything close to it in the Solr 
world, where I ended up going instead of the soon-to-be-doomed FAST search. 
Also, that tool was totally manual and of limited use in a very large 
corpus/catalog; sort of like applying a band-aid to a larger problem.


Splainer will only detail the reasons things show up in one query; it won't 
solve a bigger relevancy problem. On the other hand, there are several ways to 
skin this cat. There are solutions that analyze logs for outlying cases and 
feed the results back into Solr to automatically improve relevancy. I don't 
think many of these are open source, and some are quite proprietary.


If your company could afford to assign a bizdev guy to tweaking individual 
searches, I'm sure they could instead get some jr devs to go over the query 
logs, inspect outlying cases like zero results or too many results, determine 
whether each is a data issue or a query issue, and then recommend changes in 
the appropriate domain.
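That log-inspection idea can be started with a one-liner, since Solr's request log records `hits=N` for each query. A sketch (the log path and exact line format are assumptions about a default install):

```shell
#!/bin/sh
# Sketch: surface the most frequent zero-result queries from a Solr request
# log, assuming the default "params={q=...}" and "hits=N" fields per line.
zero_hit_queries() {
  # $1: path to a Solr log file
  grep 'hits=0' "$1" \
    | sed -n 's/.*params={\(q=[^&}]*\).*/\1/p' \
    | sort | uniq -c | sort -rn | head -20
}

# Usage: zero_hit_queries /var/solr/logs/solr.log
```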


Thanks

Robi


From: Charlie Hull 
Sent: Thursday, December 14, 2017 1:24:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Any Insights SOLR Rank tuning tool

On 13/12/2017 20:18, Sharma, Abhinav wrote:
> Hello Folks,
>
> Currently, we are running FAST ESP as a Search System & are looking to 
> migrate from FAST ESP to SOLR.
> I was just wondering if you Guys have any built-in Relevancy tool for the 
> Business Folks like what we have in FAST called SBC (Search Business Center)?
>
> Thanks, Abhi
>
I'd second Quepid as we've used it for several projects where migration
is an issue (disclaimer: we're partners with OSC and resell Quepid).

Migration is a tricky thing to get right: the business side want the new
engine to behave like the old one, but don't understand the technical
issues when you're putting in a totally different core engine; technical
folks don't necessarily understand the business drivers behind making
the transition as painless as possible for users. Developing tests (and
being able to compare both sets of search results) is essential.
Remember that you might even have to replicate some 'wrong' behaviour of
the old engine as people are used to it!

Cheers

Charlie

--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk





Re: SOLR Rest API for monitoring

2017-12-14 Thread Petersen, Robert (Contr)
you are using cloudera? sounds like a question for them...
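For what it's worth, the Metrics API on that page was only added in Solr 6.4 (if I recall correctly), which would explain the 404 on 4.10. On 4.x the closest built-in is the per-core mbeans handler; a sketch (core name is a placeholder, and JVM-level numbers like CPU/RAM would come from JMX instead):

```shell
#!/bin/sh
# Sketch: per-core stats on Solr 4.x via the mbeans handler, since
# /admin/metrics does not exist before 6.4. Core name is a placeholder.
SOLR="http://hadoop-nn2.esolocal.com:8983/solr"
CORE="collection1"

MBEANS_URL="${SOLR}/${CORE}/admin/mbeans?stats=true&wt=json"
echo "$MBEANS_URL"
# curl -s "$MBEANS_URL"
```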


From: Abhi Basu <9000r...@gmail.com>
Sent: Thursday, December 14, 2017 1:27:23 PM
To: solr-user@lucene.apache.org
Subject: SOLR Rest API for monitoring

Hi All:

I am using CDH 5.13 with Solr 4.10. Trying to automate metrics gathering
for JVM (CPU, RAM, Storage etc.) by calling the REST APIs described here ->
https://lucene.apache.org/solr/guide/6_6/metrics-reporting.html.

Are these not supported in my version of Solr? If not, what option do I
have?

I tried calling this:
http://hadoop-nn2.esolocal.com:8983/solr/admin/metrics?wt=json&type=counter&group=core

And receive 404 - request not available.

Are there any configuration changes needed?

Thanks, Abhi

--
Abhi Basu





Re: Solr upgrade from 4.x to 7.1

2017-12-14 Thread Petersen, Robert (Contr)
From what I have read, you can only upgrade to the next major version number 
without using a tool to convert the indexes to the newer version. But even 
that is still perilous due to deprecations etc.


So I think the best advice out there is to spin up a new farm on 7.1 
(especially coming from 4.x), make a new collection there, reindex everything 
into it, and then switch over to the new farm. I would also ask: are you 
planning to stay master/slave on 7.1? Wouldn't you want to go with SolrCloud?


I started with master/slave, and yes, it is simpler, but there is that one 
single point of failure (the master) for indexing. That is of course easily 
overcome manually by repurposing a slave as the new master and repointing the 
remaining slaves at it, but this is a completely manual process that you avoid 
in cloud mode.


I think you'd need to think this through more fully with the new possibilities 
available and how you'd want to migrate given your existing environment is so 
far behind.
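In outline, the spin-up-and-reindex route might look like this (collection name, shard/replica counts, and config set name are all placeholders; it only builds the URL):

```shell
#!/bin/sh
# Sketch: stand up a collection on the new 7.1 cloud and reindex into it.
SOLR7="http://new-farm:8983/solr"

CREATE_URL="${SOLR7}/admin/collections?action=CREATE&name=mycoll_v2&numShards=3&replicationFactor=2&collection.configName=mycoll_conf"
echo "$CREATE_URL"
# curl -s "$CREATE_URL"
# ...reindex from the system of record into mycoll_v2...
# then repoint clients (or a load balancer) at the new farm.
```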


Thanks

Robi


From: Drooy Drooy 
Sent: Thursday, December 14, 2017 1:27:53 PM
To: solr-user@lucene.apache.org
Subject: Solr upgrade from 4.x to 7.1

Hi All,

We have an in-house project running in Solr 4.7 with Master/Slave mode for
a few years, what is it going to take to upgrade it to SolrCloud with
TLOG/PULL replica mode ?

I read the upgrade guides, none of them talking about the jump from 4.x to
7.

Thanks much





Re: Can someone help? Two level nested doc... ChildDocTransformerFactory sytax...

2017-11-07 Thread Petersen, Robert (Contr)
OK, although this was talked about as possibly coming in Solr 6.x, I guess it 
was hearsay. From what I can tell after rereading everything I can find on the 
subject, as of now the child docs are only retrievable as a one-level 
hierarchy when using the ChildDocTransformerFactory.
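So the form that does work stays one level deep. For the record, a shell-escaped version of that flat retrieval (using this thread's example collection and fields) could look like:

```shell
#!/bin/sh
# Sketch: the one-level [child] retrieval that does work; deeper levels come
# back flattened. Collection and field names are this thread's example.
SOLR="http://localhost:8983/solr/temptest"
FL='id,[child parentFilter=type_s:customer childFilter=type_s:customerSource]'

echo "$FL"
# --data-urlencode handles the brackets and spaces in the transformer syntax:
# curl -s "${SOLR}/select" --data-urlencode 'q=id:asdf' --data-urlencode "fl=${FL}"
```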




From: Petersen, Robert (Contr) <robert.peters...@ftr.com>
Sent: Monday, November 6, 2017 5:05:31 PM
To: solr-user@lucene.apache.org
Subject: Can someone help? Two level nested doc... ChildDocTransformerFactory 
sytax...

OK, no faceting, no filtering; I just want the hierarchy to come back in the 
results. Can't quite get it... googled all over the place too.


Doc:

{
  "id": "asdf",
  "type_s": "customer",
  "firstName_s": "Manny",
  "lastName_s": "Acevedo",
  "address_s": "123 Fourth Street",
  "city_s": "Gotham",
  "tn_s": "1234561234",
  "_childDocuments_": [
    {
      "id": "asdf_c1",
      "src_s": "CRM.Customer",
      "type_s": "customerSource",
      "_childDocuments_": [
        {
          "id": "asdf_c1_c1",
          "type_s": "customerSourceType",
          "key_s": "id",
          "value_s": "GUID"
        }
      ]
    },
    {
      "id": "asdf_c2",
      "src_s": "DPI.SalesOrder",
      "type_s": "customerSource",
      "_childDocuments_": [
        {
          "id": "asdf_c2_c1",
          "type_s": "customerSourceType",
          "key_s": "btn",
          "value_s": "4052328908"
        },
        {
          "id": "asdf_c2_c2",
          "type_s": "customerSourceType",
          "key_s": "seq",
          "value_s": "5"
        },
        {
          "id": "asdf_c2_c3",
          "type_s": "customerSourceType",
          "key_s": "env",
          "value_s": "MS"
        }
      ]
    }
  ]
}


Queries:

Gives all nested docs regardless of level as a flat set
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer]

Gives all nested child docs only
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource]

How to get nested grandchild docs at correct level?
Nope exception:
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customerSource%20childFilter=type_s:customerSourceType]

Nope exception:
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customerSource]


Nope, but no exception; only gets children again like above:
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customer*]

Nope, but no exception; only gets children again:

http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customer*%20childFilter=type_s:customerSourceType]


Nope same again... no grandchildren:

http://localhost:8983/solr/temptest/select?q=id:asdf=id,p:[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],q:[child%20parentFilter=-type_s:customer%20parentFilter=type_s:customerSource%20childFilter=type_s:customerSourceType]


Gives all but flat no child to grandchild hierarchy:

http://localhost:8983/solr/temptest/select?q=id:asdf=id,p:[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],q:[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSourceType]


Thanks in advance,

Robi



This communication is confidential. Frontier only sends and receives email on 
the basis of the terms set out at http://www.frontier.com/email_disclaimer.


Can someone help? Two level nested doc... ChildDocTransformerFactory syntax...

2017-11-06 Thread Petersen, Robert (Contr)
OK no faceting, no filtering, I just want the hierarchy to come back in the 
results. Can't quite get it... googled all over the place too.


Doc:

{ id : asdf, type_s:customer, firstName_s:Manny, lastName_s:Acevedo, 
address_s:"123 Fourth Street", city_s:Gotham, tn_s:1234561234,
  _childDocuments_:[
  { id : adsf_c1,
src_s : "CRM.Customer",
type_s:customerSource,
_childDocuments_:[
{
id : asdf_c1_c1,
type_s:customerSourceType,
"key_s": "id",
"value_s": "GUID"
}
]
},
  { id : adsf_c2,
"src_s": "DPI.SalesOrder",
type_s:customerSource,
_childDocuments_:[
{
id : asdf_c2_c1,
type_s:customerSourceType,
"key_s": "btn",
"value_s": "4052328908"
},
{
id : asdf_c2_c2,
type_s:customerSourceType,
"key_s": "seq",
"value_s": "5"
   },
{
id : asdf_c2_c3,
type_s:customerSourceType,
"key_s": "env",
"value_s": "MS"
}
]
}
]
}
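As an aside on why the first query below returns everything as one flat set: a nested update like this is indexed as a Lucene block with descendants stored before their parent. A hypothetical Python sketch (abbreviated to the first child branch of the document above) that walks the structure in that order:

```python
# Abbreviated version of the nested document from this thread (proper
# JSON-style dicts; the original uses Solr's lenient unquoted keys).
doc = {
    "id": "asdf", "type_s": "customer",
    "_childDocuments_": [
        {"id": "adsf_c1", "src_s": "CRM.Customer",
         "type_s": "customerSource",
         "_childDocuments_": [
             {"id": "asdf_c1_c1", "type_s": "customerSourceType",
              "key_s": "id", "value_s": "GUID"}]},
    ],
}

def flatten(d):
    """Depth-first walk, descendants before parent - roughly the order
    Lucene stores a block (root document last), which is why [child]
    returns every level as one flat set unless childFilter narrows it."""
    out = []
    for child in d.get("_childDocuments_", []):
        out.extend(flatten(child))
    out.append(d["id"])
    return out

print(flatten(doc))  # → ['asdf_c1_c1', 'adsf_c1', 'asdf']
```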


Queries:

Gives all nested docs regardless of level as a flat set
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer]

Gives all nested child docs only
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource]
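For reference, the transformer strings in these URLs can be assembled and percent-encoded programmatically rather than by hand. A minimal Python sketch (the `child_query` helper is hypothetical; the field and filter names come from the document above):

```python
from urllib.parse import urlencode, quote

base = "http://localhost:8983/solr/temptest/select"

def child_query(q, parent_filter, child_filter=None):
    """Build a Solr select URL using the [child] doc transformer in fl."""
    transformer = f"[child parentFilter={parent_filter}"
    if child_filter:
        transformer += f" childFilter={child_filter}"
    transformer += "]"
    params = {"q": q, "fl": f"id,{transformer}"}
    # quote_via=quote encodes spaces as %20 (urlencode's default uses '+')
    return base + "?" + urlencode(params, quote_via=quote)

# All descendants, flattened:
url1 = child_query("id:asdf", "type_s:customer")
# Direct children only:
url2 = child_query("id:asdf", "type_s:customer", "type_s:customerSource")
print(url1)
print(url2)
```

Note that `quote` also percent-encodes `:` and `=` inside parameter values; Solr accepts either the encoded or literal form.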

How to get nested grandchild docs at correct level?
Nope exception:
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customerSource%20childFilter=type_s:customerSourceType]

Nope exception:
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customerSource]


Nope, but no exception; only gets children again though, like above:
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customer*]

Nope, but no exception; only gets children again:
http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],[child%20parentFilter=type_s:customer*%20childFilter=type_s:customerSourceType]


Nope same again... no grandchildren:

http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,p:[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],q:[child%20parentFilter=-type_s:customer%20parentFilter=type_s:customerSource%20childFilter=type_s:customerSourceType]


Gives all, but flat: no child-to-grandchild hierarchy:

http://localhost:8983/solr/temptest/select?q=id:asdf&fl=id,p:[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSource],q:[child%20parentFilter=type_s:customer%20childFilter=type_s:customerSourceType]


Thanks in advance,

Robi



This communication is confidential. Frontier only sends and receives email on 
the basis of the terms set out at http://www.frontier.com/email_disclaimer.


Re: Java 9

2017-11-06 Thread Petersen, Robert (Contr)
Actually I can't believe they're deprecating UseConcMarkSweepGC, that was the 
one that finally made Solr 'sing' with no OOMs!


I guess they must have found something better, have to look into that...


Robi


From: Chris Hostetter 
Sent: Monday, November 6, 2017 3:07:28 PM
To: solr-user@lucene.apache.org
Subject: Re: Java 9



: Anyone else been noticing this this msg when starting up solr with java 9? 
(This is just an FYI and not a real question)

: Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was 
deprecated in version 9.0 and will likely be removed in a future release.
: Java HotSpot(TM) 64-Bit Server VM warning: Option UseParNewGC was deprecated 
in version 9.0 and will likely be removed in a future release.

IIRC the default GC_TUNE options for Solr still assume Java 8, but also
work fine with Java 9 -- although they do cause those deprecation warnings
and result in using the JVM defaults.

You are free to customize this in your solr.in.sh if you are running java9 and
don't like the deprecation warnings ... and/or open a Jira w/suggestions
for what Solr's default GC_TUNE option should be when running in java9 (i
don't know if there is any community consensus on that yet -- but you're
welcome to try and build some)
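One such solr.in.sh customization might switch to G1 (a sketch only; the flag values here are illustrative examples, not a vetted recommendation):

```shell
# solr.in.sh -- replace the CMS-based defaults with G1 so Java 9 stops
# warning about deprecated collectors. Tune the values for your own heap.
GC_TUNE="-XX:+UseG1GC \
  -XX:MaxGCPauseMillis=250 \
  -XX:+ParallelRefProcEnabled"
```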


-Hoss
http://www.lucidworks.com/





Java 9

2017-11-06 Thread Petersen, Robert (Contr)
Hi Guys,


Anyone else been noticing this this msg when starting up solr with java 9? 
(This is just an FYI and not a real question)


Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was 
deprecated in version 9.0 and will likely be removed in a future release.
Java HotSpot(TM) 64-Bit Server VM warning: Option UseParNewGC was deprecated in 
version 9.0 and will likely be removed in a future release.


Robi





Re: Anyone have any comments on current solr monitoring favorites?

2017-11-06 Thread Petersen, Robert (Contr)
Hi Walter,


OK, now that sounds really interesting. I actually just turned on logging in 
Jetty and yes, did see all the intra-cluster traffic there. I'm pushing our ELK 
team to pick out the GET search requests across the cluster and aggregate them 
for me. We'll see how that looks, but that would just be for user query analysis 
and not for real-time analysis. Still looking for something to monitor in real 
time, since apparently my company has all its New Relic licenses tied up with 
other level-one processes and doesn't want to buy any more of them at this 
time...  lol


And yes when I looked directly at the Graphite data backing Grafana at my last 
position it was just scary!


Thanks

Robi


PS early adopter of InfluxDB in general, or just for this use case?


From: Walter Underwood <wun...@wunderwood.org>
Sent: Monday, November 6, 2017 1:44:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Anyone have any comments on current solr monitoring favorites?

We use New Relic across the site, but it doesn’t split out traffic to different 
endpoints. It also cannot distinguish between search traffic to the cluster and 
intra-cluster traffic. With four shards, the total traffic is 4X bigger than 
the incoming traffic.

We have a bunch of business metrics (orders) and other stuff that is currently 
in Graphite. We’ll almost certainly move all that to InfluxDB and Grafana.

The Solr metrics were overloading the Graphite database, so we’re the first 
service that is trying InfluxDB.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Nov 6, 2017, at 1:31 PM, Petersen, Robert (Contr) 
> <robert.peters...@ftr.com> wrote:
>
> Hi Walter,
>
>
> Yes, now I see it. I'm wondering about using Grafana and New Relic at the 
> same time since New Relic has a dashboard and also costs money for corporate 
> use. I guess after a reread you are using Grafana to visualize the influxDB 
> data and New Relic just for JVM right?  Did this give you more control over 
> the solr metrics you are monitoring? (PS I've never heard of influxDB)
>
>
> Thanks
>
> Robi
>
> 
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Monday, November 6, 2017 11:26:07 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Anyone have any comments on current solr monitoring favorites?
>
> Look back down the string to my post. We use Grafana.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
>> On Nov 6, 2017, at 11:23 AM, Petersen, Robert (Contr) 
>> <robert.peters...@ftr.com> wrote:
>>
>> Interesting! Finally a Grafana user... Thanks Daniel, I will follow your 
>> links. That looks promising.
>>
>>
>> Is anyone using Grafana over Graphite?
>>
>>
>> Thanks
>>
>> Robi
>>
>> 
>> From: Daniel Ortega <danielortegauf...@gmail.com>
>> Sent: Monday, November 6, 2017 11:19:10 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Anyone have any comments on current solr monitoring favorites?
>>
>> Hi Robert,
>>
>> We use the following stack:
>>
>> - Prometheus to scrape metrics (https://prometheus.io/)
>> - Prometheus node exporter to export "machine metrics" (Disk, network
>> usage, etc.) (https://github.com/prometheus/node_exporter)
>> - Prometheus JMX exporter to export "Solr metrics" (Cache usage, QPS,
>> Response times...) (https://github.com/prometheus/jmx_exporter)
>> - Grafana to visualize all the data scrapped by Prometheus (
>> https://grafana.com/)
>>
>> Best regards
>> Daniel Ortega
>>
>> 2017-11-06 20:13 GMT+01:00 Petersen, Robert (Contr) <
>> robert.peters...@ftr.com>:
>>
>>> PS I knew sematext would be required to chime in here!  
>>>
>>>
>>> Is there a non-expiring dev version I could experiment with? I think I did
>>> sign up for a trial years ago from a different company... I was actually
>>> wondering about hooking it up to my personal AWS based solr cloud instance.
>>>
>>>
>>> Thanks
>>>
>>> Robi
>>>
>>> 
>>> From: Emir Arnautović <emir.arnauto...@sematext.com>
>>> Sent: Thursday, November 2, 2017 2:05:10 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Anyone have any comments on current solr monitoring favorites?
>>>
>>> Hi Robi,
>>> Did you try Sematext’s SPM? It provides host, JVM and Solr metrics and
>>> more. We use it for monitoring 

Re: Anyone have any comments on current solr monitoring favorites?

2017-11-06 Thread Petersen, Robert (Contr)
Hi Walter,


Yes, now I see it. I'm wondering about using Grafana and New Relic at the same 
time since New Relic has a dashboard and also costs money for corporate use. I 
guess after a reread you are using Grafana to visualize the influxDB data and 
New Relic just for JVM right?  Did this give you more control over the solr 
metrics you are monitoring? (PS I've never heard of influxDB)


Thanks

Robi


From: Walter Underwood <wun...@wunderwood.org>
Sent: Monday, November 6, 2017 11:26:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Anyone have any comments on current solr monitoring favorites?

Look back down the string to my post. We use Grafana.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Nov 6, 2017, at 11:23 AM, Petersen, Robert (Contr) 
> <robert.peters...@ftr.com> wrote:
>
> Interesting! Finally a Grafana user... Thanks Daniel, I will follow your 
> links. That looks promising.
>
>
> Is anyone using Grafana over Graphite?
>
>
> Thanks
>
> Robi
>
> 
> From: Daniel Ortega <danielortegauf...@gmail.com>
> Sent: Monday, November 6, 2017 11:19:10 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Anyone have any comments on current solr monitoring favorites?
>
> Hi Robert,
>
> We use the following stack:
>
> - Prometheus to scrape metrics (https://prometheus.io/)
> - Prometheus node exporter to export "machine metrics" (Disk, network
> usage, etc.) (https://github.com/prometheus/node_exporter)
> - Prometheus JMX exporter to export "Solr metrics" (Cache usage, QPS,
> Response times...) (https://github.com/prometheus/jmx_exporter)
> - Grafana to visualize all the data scrapped by Prometheus (
> https://grafana.com/)
>
> Best regards
> Daniel Ortega
>
> 2017-11-06 20:13 GMT+01:00 Petersen, Robert (Contr) <
> robert.peters...@ftr.com>:
>
>> PS I knew sematext would be required to chime in here!  
>>
>>
>> Is there a non-expiring dev version I could experiment with? I think I did
>> sign up for a trial years ago from a different company... I was actually
>> wondering about hooking it up to my personal AWS based solr cloud instance.
>>
>>
>> Thanks
>>
>> Robi
>>
>> 
>> From: Emir Arnautović <emir.arnauto...@sematext.com>
>> Sent: Thursday, November 2, 2017 2:05:10 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Anyone have any comments on current solr monitoring favorites?
>>
>> Hi Robi,
>> Did you try Sematext’s SPM? It provides host, JVM and Solr metrics and
>> more. We use it for monitoring our Solr instances and for consulting.
>>
>> Disclaimer - see signature :)
>>
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>>
>>> On 2 Nov 2017, at 19:35, Walter Underwood <wun...@wunderwood.org> wrote:
>>>
>>> We use New Relic for JVM, CPU, and disk monitoring.
>>>
>>> I tried the built-in metrics support in 6.4, but it just didn’t do what
>> we want. We want rates and percentiles for each request handler. That gives
>> us 95th percentile for textbooks suggest or for homework search results
>> page, etc. The Solr metrics didn’t do that. The Jetty metrics didn’t do
>> that.
>>>
>>> We built a dedicated servlet filter that goes in front of the Solr
>> webapp and reports metrics. It has some special hacks to handle some weird
>> behavior in SolrJ. A request to the “/srp” handler is sent as
>> “/select?qt=/srp”, so we normalize that.
>>>
>>> The metrics start with the cluster name, the hostname, and the
>> collection. The rest is generated like this:
>>>
>>> URL: GET /solr/textbooks/select?q=foo&qt=/auto
>>> Metric: textbooks.GET./auto
>>>
>>> URL: GET /solr/textbooks/select?q=foo
>>> Metric: textbooks.GET./select
>>>
>>> URL: GET /solr/questions/auto
>>> Metric: questions.GET./auto
>>>
>>> So a full metric for the cluster “solr-cloud” and the host “search01"
>> would look like “solr-cloud.search01.solr.textbooks.GET./auto.m1_rate”.
>>>
>>> We send all that to InfluxDB. We’ve configured a template so that each
>> part of the metric name is mapped to a field, so we can write efficient
>> queries in InfluxQL.
>>>
>>> Metrics are graphed in Grafana. We have dashboards that mix Cloudwatch
>> (for the load balancer) 

Re: Anyone have any comments on current solr monitoring favorites?

2017-11-06 Thread Petersen, Robert (Contr)
Interesting! Finally a Grafana user... Thanks Daniel, I will follow your links. 
That looks promising.


Is anyone using Grafana over Graphite?


Thanks

Robi


From: Daniel Ortega <danielortegauf...@gmail.com>
Sent: Monday, November 6, 2017 11:19:10 AM
To: solr-user@lucene.apache.org
Subject: Re: Anyone have any comments on current solr monitoring favorites?

Hi Robert,

We use the following stack:

- Prometheus to scrape metrics (https://prometheus.io/)
- Prometheus node exporter to export "machine metrics" (Disk, network
usage, etc.) (https://github.com/prometheus/node_exporter)
- Prometheus JMX exporter to export "Solr metrics" (Cache usage, QPS,
Response times...) (https://github.com/prometheus/jmx_exporter)
- Grafana to visualize all the data scrapped by Prometheus (
https://grafana.com/)
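A minimal prometheus.yml tying the exporters above together might look like this (a sketch; the hostname and the JMX exporter port are illustrative, while 9100 is node_exporter's default):

```yaml
# prometheus.yml (sketch) - scrape the JMX exporter attached to Solr's
# JVM and the node exporter on the same host.
scrape_configs:
  - job_name: 'solr-jmx'
    static_configs:
      - targets: ['solr01:9404']   # port set when starting the JMX agent
  - job_name: 'node'
    static_configs:
      - targets: ['solr01:9100']   # node_exporter default port
```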

Best regards
Daniel Ortega

2017-11-06 20:13 GMT+01:00 Petersen, Robert (Contr) <
robert.peters...@ftr.com>:

> PS I knew sematext would be required to chime in here!  
>
>
> Is there a non-expiring dev version I could experiment with? I think I did
> sign up for a trial years ago from a different company... I was actually
> wondering about hooking it up to my personal AWS based solr cloud instance.
>
>
> Thanks
>
> Robi
>
> 
> From: Emir Arnautović <emir.arnauto...@sematext.com>
> Sent: Thursday, November 2, 2017 2:05:10 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Anyone have any comments on current solr monitoring favorites?
>
> Hi Robi,
> Did you try Sematext’s SPM? It provides host, JVM and Solr metrics and
> more. We use it for monitoring our Solr instances and for consulting.
>
> Disclaimer - see signature :)
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 2 Nov 2017, at 19:35, Walter Underwood <wun...@wunderwood.org> wrote:
> >
> > We use New Relic for JVM, CPU, and disk monitoring.
> >
> > I tried the built-in metrics support in 6.4, but it just didn’t do what
> we want. We want rates and percentiles for each request handler. That gives
> us 95th percentile for textbooks suggest or for homework search results
> page, etc. The Solr metrics didn’t do that. The Jetty metrics didn’t do
> that.
> >
> > We built a dedicated servlet filter that goes in front of the Solr
> webapp and reports metrics. It has some special hacks to handle some weird
> behavior in SolrJ. A request to the “/srp” handler is sent as
> “/select?qt=/srp”, so we normalize that.
> >
> > The metrics start with the cluster name, the hostname, and the
> collection. The rest is generated like this:
> >
> > URL: GET /solr/textbooks/select?q=foo&qt=/auto
> > Metric: textbooks.GET./auto
> >
> > URL: GET /solr/textbooks/select?q=foo
> > Metric: textbooks.GET./select
> >
> > URL: GET /solr/questions/auto
> > Metric: questions.GET./auto
> >
> > So a full metric for the cluster “solr-cloud” and the host “search01"
> would look like “solr-cloud.search01.solr.textbooks.GET./auto.m1_rate”.
> >
> > We send all that to InfluxDB. We’ve configured a template so that each
> part of the metric name is mapped to a field, so we can write efficient
> queries in InfluxQL.
> >
> > Metrics are graphed in Grafana. We have dashboards that mix Cloudwatch
> (for the load balancer) and InfluxDB.
> >
> > I’m still working out the kinks in some of the more complicated queries,
> but the data is all there. I also want to expand the servlet filter to
> report HTTP response codes.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >
> >> On Nov 2, 2017, at 9:30 AM, Petersen, Robert (Contr) <
> robert.peters...@ftr.com> wrote:
> >>
> >> OK I'm probably going to open a can of worms here...  lol
> >>
> >>
> >> In the old old days I used PSI probe to monitor solr running on tomcat
> which worked ok on a machine by machine basis.
> >>
> >>
> >> Later I had a grafana dashboard on top of graphite monitoring which was
> really nice looking but kind of complicated to set up.
> >>
> >>
> >> Even later I successfully just dropped in a newrelic java agent which
> had solr monitors and a dashboard right out of the box, but it costs money
> for the full tamale.
> >>
> >>
> >> For basic JVM health and Solr QPS and time percentiles, does anyone
> have any favorites or other alternative suggestions?
> >>
> >>
> >> Thanks in advance!
> >>
> >> Robi
> >>
> >> 
> >>
> >> This communication is confidential. Frontier only sends and receives
> email on the basis of the terms set out at http://www.frontier.com/email_
> disclaimer.
> >
>
>





Re: Anyone have any comments on current solr monitoring favorites?

2017-11-06 Thread Petersen, Robert (Contr)
PS I knew sematext would be required to chime in here!  


Is there a non-expiring dev version I could experiment with? I think I did sign 
up for a trial years ago from a different company... I was actually wondering 
about hooking it up to my personal AWS based solr cloud instance.


Thanks

Robi


From: Emir Arnautović <emir.arnauto...@sematext.com>
Sent: Thursday, November 2, 2017 2:05:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Anyone have any comments on current solr monitoring favorites?

Hi Robi,
Did you try Sematext’s SPM? It provides host, JVM and Solr metrics and more. We 
use it for monitoring our Solr instances and for consulting.

Disclaimer - see signature :)

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 2 Nov 2017, at 19:35, Walter Underwood <wun...@wunderwood.org> wrote:
>
> We use New Relic for JVM, CPU, and disk monitoring.
>
> I tried the built-in metrics support in 6.4, but it just didn’t do what we 
> want. We want rates and percentiles for each request handler. That gives us 
> 95th percentile for textbooks suggest or for homework search results page, 
> etc. The Solr metrics didn’t do that. The Jetty metrics didn’t do that.
>
> We built a dedicated servlet filter that goes in front of the Solr webapp and 
> reports metrics. It has some special hacks to handle some weird behavior in 
> SolrJ. A request to the “/srp” handler is sent as “/select?qt=/srp”, so we 
> normalize that.
>
> The metrics start with the cluster name, the hostname, and the collection. 
> The rest is generated like this:
>
> URL: GET /solr/textbooks/select?q=foo&qt=/auto
> Metric: textbooks.GET./auto
>
> URL: GET /solr/textbooks/select?q=foo
> Metric: textbooks.GET./select
>
> URL: GET /solr/questions/auto
> Metric: questions.GET./auto
>
> So a full metric for the cluster “solr-cloud” and the host “search01" would 
> look like “solr-cloud.search01.solr.textbooks.GET./auto.m1_rate”.
>
> We send all that to InfluxDB. We’ve configured a template so that each part 
> of the metric name is mapped to a field, so we can write efficient queries in 
> InfluxQL.
>
> Metrics are graphed in Grafana. We have dashboards that mix Cloudwatch (for 
> the load balancer) and InfluxDB.
>
> I’m still working out the kinks in some of the more complicated queries, but 
> the data is all there. I also want to expand the servlet filter to report 
> HTTP response codes.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
>> On Nov 2, 2017, at 9:30 AM, Petersen, Robert (Contr) 
>> <robert.peters...@ftr.com> wrote:
>>
>> OK I'm probably going to open a can of worms here...  lol
>>
>>
>> In the old old days I used PSI probe to monitor solr running on tomcat which 
>> worked ok on a machine by machine basis.
>>
>>
>> Later I had a grafana dashboard on top of graphite monitoring which was 
>> really nice looking but kind of complicated to set up.
>>
>>
>> Even later I successfully just dropped in a newrelic java agent which had 
>> solr monitors and a dashboard right out of the box, but it costs money for 
>> the full tamale.
>>
>>
>> For basic JVM health and Solr QPS and time percentiles, does anyone have any 
>> favorites or other alternative suggestions?
>>
>>
>> Thanks in advance!
>>
>> Robi
>>
>> 
>>
>> This communication is confidential. Frontier only sends and receives email 
>> on the basis of the terms set out at 
>> http://www.frontier.com/email_disclaimer.
>
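The metric-naming scheme Walter describes above can be sketched in a few lines of Python (the `metric_name` helper is hypothetical; the qt normalization mirrors the SolrJ hack he mentions):

```python
from urllib.parse import urlparse, parse_qs

def metric_name(method, url, cluster="solr-cloud", host="search01"):
    """Derive a Graphite/InfluxDB-style metric name from a Solr request
    path, following the cluster.host.solr.collection.METHOD.handler
    scheme from the thread."""
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")   # ['solr', collection, handler]
    collection, handler = parts[1], "/" + parts[2]
    qt = parse_qs(parsed.query).get("qt")
    if handler == "/select" and qt:             # normalize /select?qt=/xyz
        handler = qt[0]
    return f"{cluster}.{host}.solr.{collection}.{method}.{handler}"

print(metric_name("GET", "/solr/textbooks/select?q=foo&qt=/auto"))
# → solr-cloud.search01.solr.textbooks.GET./auto
```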



String payloads...

2017-11-06 Thread Petersen, Robert (Contr)
Hi Guys,


I was playing with the payloads example, as I had a possible use case of 
alternate product titles for a product.

https://lucidworks.com/2017/09/14/solr-payloads/

bin/solr start
bin/solr create -c payloads
bin/post -c payloads -type text/csv -out yes -d $'id,vals_dpf\n1,one|1.0 
two|2.0 three|3.0\n2,weig...

I saw you could do this:

http://localhost:8983/solr/payloads/query?q=*:*&wt=csv&fl=id,p:payload(vals_dpf,three)
id,p
1,3.0
2,0.0

So I wanted to do something similar with strings, and I loaded Solr with


./post -c payloads -type text/csv -out yes -d 
$'id,vals_dps\n1,one|thisisastring two|"this is a string" three|hi\n2,json|{asdf:123}'


http://localhost:8983/solr/payloads/query?q=vals_dps:json


[{"id":"2","vals_dps":"json|{asdf:123}","_version_":1583284597287813000}]


OK, so here is my question: it seems like the payload function only works 
against numeric payloads. Further, I can't see a way to get the payload to come 
out alone, without the field value attached. What I would like is something like 
this; is this possible in any way? I know it would be easy enough to do some 
post-query processing in a service layer, but... just wondering about this. It 
seems like I should be able to get at the payload when it is a string.


http://localhost:8983/solr/payloads/query?q=vals_dps:json&fl=id,p:payloadvalue(vals_dpf,json)


[{"id":"2","p":"{asdf:123}","_version_":1583284597287813000}]
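Pending anything built in, the service-layer post-processing mentioned above might look like this (a sketch; `extract_payload` is a hypothetical helper that only handles simple space-separated, unquoted payloads):

```python
def extract_payload(stored_value, key):
    """Pull the payload out of a stored 'token|payload' field value
    (delimited-payload syntax). Returns None if the key isn't present.
    Splits each token only on the first '|' so payloads containing '|'
    survive; quoted multi-word payloads would need real tokenization."""
    for token in stored_value.split():
        term, sep, payload = token.partition("|")
        if sep and term == key:
            return payload
    return None

doc = {"id": "2", "vals_dps": "json|{asdf:123}"}
print(extract_payload(doc["vals_dps"], "json"))  # → {asdf:123}
```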

Thanks

Robi






Anyone have any comments on current solr monitoring favorites?

2017-11-02 Thread Petersen, Robert (Contr)
OK I'm probably going to open a can of worms here...  lol


In the old old days I used PSI probe to monitor solr running on tomcat which 
worked ok on a machine by machine basis.


Later I had a grafana dashboard on top of graphite monitoring which was really 
nice looking but kind of complicated to set up.


Even later I successfully just dropped in a newrelic java agent which had solr 
monitors and a dashboard right out of the box, but it costs money for the full 
tamale.


For basic JVM health and Solr QPS and time percentiles, does anyone have any 
favorites or other alternative suggestions?


Thanks in advance!

Robi





Re: Upgrade path from 5.4.1

2017-11-02 Thread Petersen, Robert (Contr)
Thanks guys! I kind of suspected this would be the best route and I'll move 
forward with a fresh start on 7.x as soon as I can get ops to give me the 
needed machines! 


Best

Robi


From: Erick Erickson 
Sent: Thursday, November 2, 2017 8:17:49 AM
To: solr-user
Subject: Re: Upgrade path from 5.4.1

Yonik:

Yeah, I was just parroting what had been reported; I have no data to
back it up personally. I just saw the JIRA that Simon indicated and it
looks like the statement "which are faster on all fronts and use less
memory" is just flat wrong when it comes to looking up individual
values.

Ya learn somethin' new every day.

On Thu, Nov 2, 2017 at 6:57 AM, simon  wrote:
> though see SOLR-11078 , which is reporting significant query slowdowns
> after converting  *Trie to *Point fields in 7.1, compared with 6.4.2
>
> On Wed, Nov 1, 2017 at 9:06 PM, Yonik Seeley  wrote:
>
>> On Wed, Nov 1, 2017 at 2:36 PM, Erick Erickson 
>> wrote:
>> > I _always_ prefer to reindex if possible. Additionally, as of Solr 7
>> > all the numeric types are deprecated in favor of points-based types
>> > which are faster on all fronts and use less memory.
>>
>> They are a good step forward in general, and faster for range queries
>> (and multiple dimensions), but looking at the design I'd guess that
>> they may be slower for exact-match queries?
>> Has anyone tested this?
>>
>> -Yonik
>>





Upgrade path from 5.4.1

2017-11-01 Thread Petersen, Robert (Contr)
Hi Guys,


I just took over the care and feeding of three poor neglected solr 5.4.1 cloud 
clusters at my new position. While spinning up new collections and supporting 
other business initiatives I am pushing management to give me the green light 
on migrating to a newer version of Solr. The last Solr I worked with was 6.6.1, 
and I was thinking of doing an upgrade to that (er, actually 6.6.2), as I read 
that an existing index only upgrades one major version number at a time.


Then I realized the existing 5.4.1 cloud clusters here were set up with 
unmanaged configs, so now I'm starting to lean toward just spinning up clean 
new 6.6.2 or 7.1 clouds on new machines, leaving the existing 5.4.1 machines in 
place, then reindexing everything onto the new machines, with the intention of 
testing and then swapping in the new machines and finally destroying the old 
ones when the dust settles (they're all virtuals, so NP just destroying the old 
instances and recovering their resources).


Thoughts?


Thanks

Robi


