[jira] [Commented] (SOLR-12415) Solr Loadbalancer client LBHttpSolrClient not working as expected, if a Solr node goes down, it is unable to detect when it become live again due to 404 error

2018-06-14 Thread Grzegorz Lebek (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512448#comment-16512448
 ] 

Grzegorz Lebek commented on SOLR-12415:
---

[~elyograg]

There are even more problems there.

If anyone uses authentication where the user is passed as a request parameter, 
the query in the method 'checkAZombieServer' will always fail (because it does 
not pass the auth params), and the client will assume that the server is still 
down. This is one more reason why a zombie server will never come back alive in 
the client.
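
To illustrate the point about auth parameters, here is a minimal sketch (not SolrJ code) of a probe that carries caller-supplied parameters along with its fixed ones. The class `ProbeParams`, its methods, and the `user` parameter name are hypothetical, and the fixed probe parameters are assumed for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ.
final class ProbeParams {

    /** Fixed parameters of a liveness probe (values assumed for this sketch). */
    static Map<String, String> base() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("rows", "0");
        params.put("distrib", "false");
        return params;
    }

    /** Probe parameters plus caller-supplied ones, such as auth parameters. */
    static Map<String, String> withExtras(Map<String, String> extras) {
        Map<String, String> params = base();
        params.putAll(extras);
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> auth = new LinkedHashMap<>();
        auth.put("user", "alice"); // hypothetical auth parameter
        // Without extras the probe has no "user" parameter, so a server that
        // requires it would reject the probe.
        System.out.println(base().containsKey("user"));
        System.out.println(withExtras(auth).containsKey("user"));
    }
}
```

If the probe were built like `withExtras`, a server that requires the user parameter could answer the alive check instead of rejecting it.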

> Solr Loadbalancer client LBHttpSolrClient not working as expected, if a Solr 
> node goes down, it is unable to detect when it become live again due to 404 
> error
> --
>
> Key: SOLR-12415
> URL: https://issues.apache.org/jira/browse/SOLR-12415
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.2.1, 7.3.1, 7.4
> Environment: Solr 7.2.1
> 2 servers - master and slave.
> Reporter: Grzegorz Lebek
> Priority: Critical
>
> *Context*
> When LBHttpSolrClient is constructed using *base root URLs* and a slave goes 
> down and then comes back up, the client is unable to mark it as alive again 
> because of a 404 error.
> Logs below:
> {code:java}
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "GET 
> /solr/select?q=*%3A*&rows=0&sort=_docid_+asc&distrib=false&wt=javabin&version=2 
> HTTP/1.1[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> 
> "User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 
> 1.0[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "Host: 
> localhost:8984[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> 
> "Connection: Keep-Alive[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "HTTP/1.1 
> 404 Not Found[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "Cache-Control: must-revalidate,no-cache,no-store[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "Content-Type: text/html;charset=iso-8859-1[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "Content-Length: 243[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\r][\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "Error 404 Not Found[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "HTTP ERROR 404[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "Problem 
> accessing /solr/select. Reason:[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << " Not 
> Found[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "[\n]"
>  DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
> "[\n]"{code}
> *Analysis*
> When using only *base root URLs* in an LBHttpSolrClient, we need to pass a 
> "*collection*" parameter when sending a request. This works fine, except that 
> the method 
> {code:java}
> private void checkAZombieServer(ServerWrapper zombieServer){code}
> queries Solr without the collection parameter to check whether the server is 
> alive. This causes HTML content (apparently the dashboard) to be returned, so 
> execution falls into the exception clause of the method. As a result, even if 
> the server is back, it is never marked as alive again.
> I debugged this, and if we pass a collection name there as a second parameter, 
> the server responds correctly.
> The suggestion is either to pass the collection name somehow or to change the 
> way zombie servers are pinged.
> *Steps to reproduce*
> Run 2 servers, master and slave. Create the client using base URLs. Index, 
> test search, etc.
> Turn off the slave server and, after a couple of seconds, turn it on again.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12415) Solr Loadbalancer client LBHttpSolrClient not working as expected, if a Solr node goes down, it is unable to detect when it become live again due to 404 error

2018-05-30 Thread Grzegorz Lebek (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494932#comment-16494932
 ] 

Grzegorz Lebek commented on SOLR-12415:
---

[~elyograg]

Thanks for picking this up so quickly. Let me just add my comments.

bq. Move 'setDefaultCollection' from CloudSolrClient to SolrClient, fix any 
problems that causes, and require setDefaultCollection for LBHttpSolrClient to 
work properly.
This may limit the flexibility of a client that was created without any core 
information.
 
bq. Have LBHttpSolrClient make a CoreAdmin call to get a list of valid cores 
and choose one for the zombie server check - but only if setDefaultCollection 
was not used.
Maybe this would work. But isn't a successful call to get the list of valid 
cores already enough to know that the server is not a zombie? And what if there 
is no core at all?
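
The question above can be made concrete with a small sketch of a decision rule. It assumes a node-level request (such as a core listing) yields an HTTP status and a core count; the class `NodeProbe` and its members are hypothetical and are not part of SolrJ.

```java
// Hypothetical decision rule, not SolrJ code.
final class NodeProbe {

    enum State { DOWN, ALIVE_NO_CORES, ALIVE_WITH_CORES }

    /**
     * Classifies a node from the outcome of a node-level request.
     * Any non-200 status is treated as DOWN in this sketch.
     */
    static State classify(int httpStatus, int coreCount) {
        if (httpStatus != 200) {
            return State.DOWN;
        }
        return coreCount == 0 ? State.ALIVE_NO_CORES : State.ALIVE_WITH_CORES;
    }

    /** A node that answered the request is alive, even when it hosts no cores. */
    static boolean isAlive(State state) {
        return state != State.DOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify(200, 0)); // answered, but nothing to query
        System.out.println(classify(404, 0)); // did not answer as expected
    }
}
```

Under this rule the follow-up query against a chosen core becomes optional: it is only possible in the `ALIVE_WITH_CORES` state, while liveness is already known in both alive states.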



[jira] [Updated] (SOLR-12415) Solr Loadbalancer client LBHttpSolrClient not working as expected, if a Solr node goes down, it is unable to detect when it become live again due to 404 error

2018-05-29 Thread Grzegorz Lebek (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grzegorz Lebek updated SOLR-12415:
--
Description: 
*Context*
When LBHttpSolrClient is constructed using *base root URLs* and a slave goes 
down and then comes back up, the client is unable to mark it as alive again 
because of a 404 error.

Logs below:
{code:java}
 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "GET 
/solr/select?q=*%3A*&rows=0&sort=_docid_+asc&distrib=false&wt=javabin&version=2 
HTTP/1.1[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "User-Agent: 
Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "Host: 
localhost:8984[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "Connection: 
Keep-Alive[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "HTTP/1.1 404 
Not Found[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Cache-Control: must-revalidate,no-cache,no-store[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Content-Type: text/html;charset=iso-8859-1[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Content-Length: 243[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "Error 
404 Not Found[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"HTTP ERROR 404[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "Problem 
accessing /solr/select. Reason:[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << " Not 
Found[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"[\n]"{code}
*Analysis*
When using only *base root URLs* in an LBHttpSolrClient, we need to pass a 
"*collection*" parameter when sending a request. This works fine, except that 
the method 
{code:java}
private void checkAZombieServer(ServerWrapper zombieServer){code}
queries Solr without the collection parameter to check whether the server is 
alive. This causes HTML content (apparently the dashboard) to be returned, so 
execution falls into the exception clause of the method. As a result, even if 
the server is back, it is never marked as alive again.
I debugged this, and if we pass a collection name there as a second parameter, 
the server responds correctly.

The suggestion is either to pass the collection name somehow or to change the 
way zombie servers are pinged.

*Steps to reproduce*

Run 2 servers, master and slave. Create the client using base URLs. Index, 
test search, etc.

Turn off the slave server and, after a couple of seconds, turn it on again.

 

  was:
*Context*
When LBHttpSolrClient is constructed using *base URLs* and a slave goes down 
and then comes back up, the client is unable to mark it as alive again because 
of a 404 error.

Logs below:
{code:java}
 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "GET 
/solr/select?q=*%3A*&rows=0&sort=_docid_+asc&distrib=false&wt=javabin&version=2 
HTTP/1.1[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "User-Agent: 
Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "Host: 
localhost:8984[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "Connection: 
Keep-Alive[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "HTTP/1.1 404 
Not Found[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Cache-Control: must-revalidate,no-cache,no-store[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Content-Type: text/html;charset=iso-8859-1[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Content-Length: 243[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "Error 
404 Not Found[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] 

[jira] [Created] (SOLR-12415) Solr Loadbalancer client LBHttpSolrClient not working as expected, if a Solr node goes down, it is unable to detect when it become live again due to 404 error

2018-05-29 Thread Grzegorz Lebek (JIRA)
Grzegorz Lebek created SOLR-12415:
-

 Summary: Solr Loadbalancer client LBHttpSolrClient not working as 
expected, if a Solr node goes down, it is unable to detect when it become live 
again due to 404 error
 Key: SOLR-12415
 URL: https://issues.apache.org/jira/browse/SOLR-12415
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: 7.3.1, 7.2.1, 7.4
 Environment: Solr 7.2.1

2 servers - master and slave.
Reporter: Grzegorz Lebek


*Context*
When LBHttpSolrClient is constructed using *base URLs* and a slave goes down 
and then comes back up, the client is unable to mark it as alive again because 
of a 404 error.

Logs below:
{code:java}
 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "GET 
/solr/select?q=*%3A*&rows=0&sort=_docid_+asc&distrib=false&wt=javabin&version=2 
HTTP/1.1[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "User-Agent: 
Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "Host: 
localhost:8984[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "Connection: 
Keep-Alive[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 >> "[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "HTTP/1.1 404 
Not Found[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Cache-Control: must-revalidate,no-cache,no-store[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Content-Type: text/html;charset=iso-8859-1[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"Content-Length: 243[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\r][\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "Error 
404 Not Found[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"HTTP ERROR 404[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "Problem 
accessing /solr/select. Reason:[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << " Not 
Found[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << "[\n]"

 DEBUG [aliveCheckExecutor-1-thread-1] [wire] http-outgoing-83 << 
"[\n]"{code}

*Analysis*
When using only *base URLs* in an LBHttpSolrClient, we need to pass a 
"*collection*" parameter when sending a request. This works fine, except that 
the method 
{code:java}
private void checkAZombieServer(ServerWrapper zombieServer){code}
queries Solr without the collection parameter to check whether the server is 
alive. This causes HTML content (apparently the dashboard) to be returned, so 
execution falls into the exception clause of the method. As a result, even if 
the server is back, it is never marked as alive again.
I debugged this, and if we pass a collection name there as a second parameter, 
the server responds correctly.

The suggestion is either to pass the collection name somehow or to change the 
way zombie servers are pinged.
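
The path difference described in the analysis can be sketched as follows. This is an illustration only, not SolrJ code: the class `AliveCheckPath`, its `build` method, and the core name `mycore` are hypothetical, and the base URL is taken from the Host header in the log above with an assumed `/solr` context.

```java
// Illustrative only: mimics how a request path is derived from a base URL.
// This is not SolrJ code; class and method names are hypothetical.
final class AliveCheckPath {

    /** Joins a base URL, an optional collection, and a handler such as "/select". */
    static String build(String baseUrl, String collection, String handler) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        if (collection == null || collection.isEmpty()) {
            // No collection: the handler is appended directly to the root URL.
            return base + handler;
        }
        return base + "/" + collection + handler;
    }

    public static void main(String[] args) {
        String root = "http://localhost:8984/solr";
        // Without a collection the path is /solr/select, as in the wire log.
        System.out.println(build(root, null, "/select"));
        // With a collection the request is addressed to that core.
        System.out.println(build(root, "mycore", "/select"));
    }
}
```

With a root URL and no collection, the alive check has nothing to address the request to except the root itself, which is consistent with the 404 on `/solr/select` in the log.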

*Steps to reproduce*

Run 2 servers, master and slave. Create the client using base URLs. Index, 
test search, etc.

Turn off the slave server and, after a couple of seconds, turn it on again.

 


