Re: SolrCloud installation troubles...

2018-01-29 Thread Rick Leir
SELinux? Number open File limits? Number of Process limits? 
-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 
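
The open-file and process limits can be read from the account that runs Solr. Below is a minimal sketch using Python's standard `resource` module (Unix only). The 65000 threshold and the helper names are assumptions for illustration, so check the Solr documentation for the values recommended for your version. SELinux status is not covered here.

```python
import resource

# Assumed threshold for illustration only; not taken from this thread.
RECOMMENDED_MINIMUM = 65000


def describe_limit(name, limit_id, minimum=RECOMMENDED_MINIMUM):
    """Return (name, soft, hard, ok) for one resource limit of this process."""
    soft, hard = resource.getrlimit(limit_id)
    # RLIM_INFINITY means "unlimited", which always satisfies the minimum.
    ok = soft == resource.RLIM_INFINITY or soft >= minimum
    return name, soft, hard, ok


def check_limits():
    """Check the open-file and process limits of the current user."""
    checks = [
        ("open files", resource.RLIMIT_NOFILE),
        ("processes", resource.RLIMIT_NPROC),
    ]
    return [describe_limit(name, limit_id) for name, limit_id in checks]


if __name__ == "__main__":
    for name, soft, hard, ok in check_limits():
        status = "ok" if ok else "LOW"
        print(f"{name}: soft={soft} hard={hard} [{status}]")
```

The numbers are those of the process running the script, so it only says something about Solr if it is run as the same user and from the same kind of session that starts Solr.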

Re: SolrCloud installation troubles...

2018-01-29 Thread Scott Prentice


On 1/29/18 1:31 PM, Shawn Heisey wrote:
> On 1/29/2018 2:02 PM, Scott Prentice wrote:
>> Thanks, Shawn. I was wondering if there was something going on with
>> IP redirection that was causing confusion. Any thoughts on how to
>> debug? And, what do you mean by "extreme garbage collection pauses"?
>> Is that Solr garbage collection or the OS itself? There's really
>> nothing happening on this machine, it's purely for testing so there
>> shouldn't be any extra load from other processes.
>
> Garbage collection is one of the primary features of Java's memory
> management.  It's not Solr or the OS.
>
> If the java heap is really enormous, you can end up with long pauses,
> but I wouldn't expect them to be frequent unless the index is also
> really huge.
>
> A very common issue that can cause even worse pause issues than a
> large heap is a heap that's too small, but not quite small enough to
> cause Java to completely run out of heap memory.  The default max heap
> size in recent Solr versions is 512MB, which is very small.  A Java
> program (which Solr is) can never use more heap memory than the
> maximum it is configured with, even if the machine has more memory
> available.
>
> This paragraph is included because you mentioned IP redirection:
> Extreme care must be used when setting up SolrCloud on virtual
> machines where accessing the VM has to go through any kind of IP
> translation.  SolrCloud keeps track of how to reach each server in the
> cloud and if it stores an untranslated address when you need the
> translated address (or vice-versa), things are not going to work.
> Generally speaking translated addresses are going to be problematic
> for SolrCloud, and should not be used.
>
> Thanks,
> Shawn

Thanks for the clarification. Yes, we're just using the default heap 
size for Solr, but there's no index (yet) and nothing really going on, 
so I'd hope that garbage collection isn't the problem.


I'm putting my money on some IP translation issues (this is on a tightly 
controlled corporate network) or the fact that the 2888 and 2890 ports 
appear to not be open. I'll dig down the network issue path for now and 
see where that gets me.


Thanks,
...scott




Re: SolrCloud installation troubles...

2018-01-29 Thread Shawn Heisey

On 1/29/2018 2:02 PM, Scott Prentice wrote:
> Thanks, Shawn. I was wondering if there was something going on with IP
> redirection that was causing confusion. Any thoughts on how to debug?
> And, what do you mean by "extreme garbage collection pauses"? Is that
> Solr garbage collection or the OS itself? There's really nothing
> happening on this machine, it's purely for testing so there shouldn't
> be any extra load from other processes.


Garbage collection is one of the primary features of Java's memory 
management.  It's not Solr or the OS.


If the Java heap is really enormous, you can end up with long pauses, 
but I wouldn't expect them to be frequent unless the index is also 
really huge.


A very common issue that can cause even worse pause issues than a large 
heap is a heap that's too small, but not quite small enough to cause 
Java to completely run out of heap memory.  The default max heap size in 
recent Solr versions is 512MB, which is very small.  A Java program 
(which Solr is) can never use more heap memory than the maximum it is 
configured with, even if the machine has more memory available.
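
To see what heap a running node actually has, one option is Solr's system info endpoint. The sketch below is only an illustration: it assumes the response has a `jvm.memory.raw` section with `max` and `used` in bytes, and the base URL and function names are made up for the example.

```python
import json
from urllib.request import urlopen


def heap_summary(system_info):
    """Extract max and used heap (in MB) from a parsed system-info response.

    Assumes the layout jvm -> memory -> raw -> {max, used}, values in bytes.
    """
    raw = system_info["jvm"]["memory"]["raw"]
    mb = 1024 * 1024
    return {
        "max_mb": raw["max"] / mb,
        "used_mb": raw["used"] / mb,
        "used_pct": 100.0 * raw["used"] / raw["max"],
    }


def fetch_system_info(base_url, timeout=5):
    """Fetch /admin/info/system from one Solr node and parse the JSON body."""
    url = f"{base_url}/admin/info/system?wt=json"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example use against a node (the URL is an assumption):
#   info = fetch_system_info("http://localhost:8983/solr")
#   print(heap_summary(info))
```

If the maximum turns out to be around the 512MB default, the heap can be raised when starting Solr, for example with the `-m` option of `bin/solr start` or the `SOLR_HEAP` setting in `solr.in.sh`.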


This paragraph is included because you mentioned IP redirection:  
Extreme care must be used when setting up SolrCloud on virtual machines 
where accessing the VM has to go through any kind of IP translation.  
SolrCloud keeps track of how to reach each server in the cloud and if it 
stores an untranslated address when you need the translated address (or 
vice-versa), things are not going to work.  Generally speaking 
translated addresses are going to be problematic for SolrCloud, and 
should not be used.
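
A rough way to test this is to take the node names exactly as SolrCloud reports them (for example `10.6.208.31:8984_solr` in the error output) and check whether those addresses are reachable from the machine where each node runs. The sketch below assumes the `host:port_context` form seen in that error message and a host name without underscores. The function names are invented for the example.

```python
import socket


def parse_node_name(node_name):
    """Split a node name like '10.6.208.31:8984_solr' into (host, port, context)."""
    address, _, context = node_name.partition("_")
    host, _, port = address.rpartition(":")
    return host, int(port), context


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_nodes(node_names):
    """Map each node name to whether its registered address is reachable."""
    results = {}
    for name in node_names:
        host, port, _ = parse_node_name(name)
        results[name] = can_connect(host, port)
    return results


# Example use with the node names from the error message:
#   probe_nodes(["10.6.208.31:8983_solr", "10.6.208.31:8984_solr",
#                "10.6.208.31:8985_solr"])
```

If a node answers on localhost but not on the address it registered, that points at address translation or a firewall rather than at Solr itself.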


Thanks,
Shawn



Re: SolrCloud installation troubles...

2018-01-29 Thread Scott Prentice
Looks like 2888 and 2890 are not open. At least they are not reported 
with a netstat -plunt .. could be the problem.
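
Besides looking at listening ports, each ZooKeeper server can be asked directly whether it is healthy. The sketch below sends ZooKeeper's four-letter `ruok` command to a client port and expects `imok` back. The client ports in the usage comment and the helper names are assumptions, not values from this setup, and depending on the ZooKeeper version the four-letter commands may need to be whitelisted. Also, as far as I know, only the current leader of an ensemble listens on its quorum port, so a follower's quorum port missing from netstat is not necessarily a fault.

```python
import socket


def four_letter_word(host, port, command="ruok", timeout=3.0):
    """Send a ZooKeeper four-letter command and return the reply as text.

    Returns None if the connection fails or times out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(command.encode("ascii"))
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # the server closes the connection after replying
                    break
                chunks.append(data)
            return b"".join(chunks).decode("utf-8", errors="replace")
    except OSError:
        return None


def ensemble_status(client_ports, host="127.0.0.1"):
    """Map each client port to True if that server answered 'imok'."""
    return {port: four_letter_word(host, port) == "imok" for port in client_ports}


# Example use (ports are assumed, adjust to the clientPort of each server):
#   print(ensemble_status([2181, 2182, 2183]))
```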


Thanks, all!

...scott


On 1/29/18 1:10 PM, Davis, Daniel (NIH/NLM) [C] wrote:

Trying 127.0.0.1 could help.   We kind of tend to think localhost is always 
127.0.0.1, but I've seen localhost start to resolve to ::1, the IPv6 equivalent 
of 127.0.0.1.

I guess some environments can be strict enough to restrict communication on 
localhost; seems hard to imagine, but it does happen.

-Original Message-
From: Scott Prentice [mailto:s...@leximation.com]
Sent: Monday, January 29, 2018 4:02 PM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud installation troubles...


On 1/29/18 12:44 PM, Shawn Heisey wrote:

On 1/29/2018 1:13 PM, Scott Prentice wrote:

But when I do the same thing on the Red Hat system it fails. Through
the UI, it'll first time out with this message ..

     Connection to Solr lost

Then after a refresh, the collection appears to have been partially
created, but it's in the "Gone" state, and after some time, is
deleted by an apparent cleanup process. If I try to create one
through the command line ..

     ./bin/solr create -c test99 -n _default -s 2 -rf 2

I get this response ..

ERROR: Failed to create collection 'test99' due to:
{10.6.208.31:8984_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8984/solr,
10.6.208.31:8985_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8985/solr,
10.6.208.31:8983_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8983/solr}

This sounds like either network connectivity problems or possibly
issues caused by extreme garbage collection pauses that result in
timeouts.

Thanks,
Shawn


Thanks, Shawn. I was wondering if there was something going on with IP 
redirection that was causing confusion. Any thoughts on how to debug?
And, what do you mean by "extreme garbage collection pauses"? Is that Solr 
garbage collection or the OS itself? There's really nothing happening on this machine, 
it's purely for testing so there shouldn't be any extra load from other processes.

Thanks!
...scott







RE: SolrCloud installation troubles...

2018-01-29 Thread Davis, Daniel (NIH/NLM) [C]
Trying 127.0.0.1 could help.   We kind of tend to think localhost is always 
127.0.0.1, but I've seen localhost start to resolve to ::1, the IPv6 equivalent 
of 127.0.0.1.

I guess some environments can be strict enough to restrict communication on 
localhost; seems hard to imagine, but it does happen.
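
A quick way to see what a name resolves to on a given machine is to ask the resolver directly. This is a small sketch using Python's standard `socket.getaddrinfo`. The function names are made up for the example.

```python
import socket


def resolve_all(hostname):
    """Return the distinct IP addresses a hostname resolves to, in resolver order."""
    seen = []
    for _, _, _, _, sockaddr in socket.getaddrinfo(
        hostname, None, type=socket.SOCK_STREAM
    ):
        address = sockaddr[0]
        if address not in seen:
            seen.append(address)
    return seen


def classify(address):
    """Label an address as 'IPv6' or 'IPv4' based on its textual form."""
    return "IPv6" if ":" in address else "IPv4"


if __name__ == "__main__":
    for address in resolve_all("localhost"):
        print(f"localhost -> {address} ({classify(address)})")
```

If `::1` is listed first, services that only listen on 127.0.0.1 may be unreachable through the name "localhost" even though they are running.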

-Original Message-
From: Scott Prentice [mailto:s...@leximation.com] 
Sent: Monday, January 29, 2018 4:02 PM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud installation troubles...


On 1/29/18 12:44 PM, Shawn Heisey wrote:
> On 1/29/2018 1:13 PM, Scott Prentice wrote:
>> But when I do the same thing on the Red Hat system it fails. Through 
>> the UI, it'll first time out with this message ..
>>
>>     Connection to Solr lost
>>
>> Then after a refresh, the collection appears to have been partially 
>> created, but it's in the "Gone" state, and after some time, is 
>> deleted by an apparent cleanup process. If I try to create one 
>> through the command line ..
>>
>>     ./bin/solr create -c test99 -n _default -s 2 -rf 2
>>
>> I get this response ..
>>
>> ERROR: Failed to create collection 'test99' due to: 
>> {10.6.208.31:8984_solr=org.apache.solr.client.solrj.SolrServerException:IOException
>> occured when talking to server at: http://10.6.208.31:8984/solr,
>> 10.6.208.31:8985_solr=org.apache.solr.client.solrj.SolrServerException:IOException
>> occured when talking to server at: http://10.6.208.31:8985/solr,
>> 10.6.208.31:8983_solr=org.apache.solr.client.solrj.SolrServerException:IOException
>> occured when talking to server at: http://10.6.208.31:8983/solr}
>
> This sounds like either network connectivity problems or possibly 
> issues caused by extreme garbage collection pauses that result in 
> timeouts.
>
> Thanks,
> Shawn
>
Thanks, Shawn. I was wondering if there was something going on with IP 
redirection that was causing confusion. Any thoughts on how to debug? 
And, what do you mean by "extreme garbage collection pauses"? Is that Solr 
garbage collection or the OS itself? There's really nothing happening on this 
machine, it's purely for testing so there shouldn't be any extra load from 
other processes.

Thanks!
...scott





Re: SolrCloud installation troubles...

2018-01-29 Thread Scott Prentice
Interesting. I am using "localhost" in the config files (using the IP 
caused things to break even worse). But perhaps I should check with IT 
to make sure the ports are all open.


Thanks,
...scott


On 1/29/18 12:57 PM, Davis, Daniel (NIH/NLM) [C] wrote:

To expand on that answer, you have to wonder what ports are open in the server 
system's port-based firewall. I have to ask my systems team to open ports 
for everything I'm using, especially when I move from localhost to outside.

You should be able to "fake it out" if you set up your zookeeper configuration 
to use localhost ports.

-Original Message-
From: Scott Prentice [mailto:s...@leximation.com]
Sent: Monday, January 29, 2018 3:13 PM
To: solr-user@lucene.apache.org
Subject: SolrCloud installation troubles...

Using Solr 7.2.0 and Zookeeper 3.4.11

In an effort to move to a more robust Solr environment, I'm setting up a 
prototype system of 3 Solr servers and 3 Zookeeper servers. For now, this is 
all on one machine, but will eventually be 3 machines.

This works fine on an Ubuntu 5.4.0-6 VM on my local system, but when I do the same setup on 
the company's network machine (a Red Hat 4.8.5-16 VM), I'm unable to create a collection. To 
keep things simple, I'm not using our custom schema yet, but just creating a collection 
through the Solr Admin UI using Collections > Add Collection, using the 
"_default" config set. On the Ubuntu system, I can create various collections .. 1 
shard w/ 1 replication .. 2 shards w/ 3 replications .. 3 shards w/ 4 replications .. all 
seem alive and well.

But when I do the same thing on the Red Hat system it fails. Through the UI, 
it'll first time out with this message ..

      Connection to Solr lost

Then after a refresh, the collection appears to have been partially created, but it's in 
the "Gone" state, and after some time, is deleted by an apparent cleanup 
process. If I try to create one through the command line ..

      ./bin/solr create -c test99 -n _default -s 2 -rf 2

I get this response ..

ERROR: Failed to create collection 'test99' due to:
{10.6.208.31:8984_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8984/solr, 
10.6.208.31:8985_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8985/solr, 
10.6.208.31:8983_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8983/solr}

I've seen other reports of errors like this but no solutions that seem to apply 
to my situation. Any thoughts?

Thanks!
...scott






Re: SolrCloud installation troubles...

2018-01-29 Thread Scott Prentice


On 1/29/18 12:44 PM, Shawn Heisey wrote:
> On 1/29/2018 1:13 PM, Scott Prentice wrote:
>> But when I do the same thing on the Red Hat system it fails. Through
>> the UI, it'll first time out with this message ..
>>
>>     Connection to Solr lost
>>
>> Then after a refresh, the collection appears to have been partially
>> created, but it's in the "Gone" state, and after some time, is
>> deleted by an apparent cleanup process. If I try to create one
>> through the command line ..
>>
>>     ./bin/solr create -c test99 -n _default -s 2 -rf 2
>>
>> I get this response ..
>>
>> ERROR: Failed to create collection 'test99' due to:
>> {10.6.208.31:8984_solr=org.apache.solr.client.solrj.SolrServerException:IOException
>> occured when talking to server at: http://10.6.208.31:8984/solr,
>> 10.6.208.31:8985_solr=org.apache.solr.client.solrj.SolrServerException:IOException
>> occured when talking to server at: http://10.6.208.31:8985/solr,
>> 10.6.208.31:8983_solr=org.apache.solr.client.solrj.SolrServerException:IOException
>> occured when talking to server at: http://10.6.208.31:8983/solr}
>
> This sounds like either network connectivity problems or possibly
> issues caused by extreme garbage collection pauses that result in
> timeouts.
>
> Thanks,
> Shawn

Thanks, Shawn. I was wondering if there was something going on with IP 
redirection that was causing confusion. Any thoughts on how to debug? 
And, what do you mean by "extreme garbage collection pauses"? Is that 
Solr garbage collection or the OS itself? There's really nothing 
happening on this machine, it's purely for testing so there shouldn't be 
any extra load from other processes.


Thanks!
...scott





RE: SolrCloud installation troubles...

2018-01-29 Thread Davis, Daniel (NIH/NLM) [C]
To expand on that answer, you have to wonder what ports are open in the server 
system's port-based firewall. I have to ask my systems team to open ports 
for everything I'm using, especially when I move from localhost to outside.

You should be able to "fake it out" if you set up your zookeeper configuration 
to use localhost ports.

-Original Message-
From: Scott Prentice [mailto:s...@leximation.com] 
Sent: Monday, January 29, 2018 3:13 PM
To: solr-user@lucene.apache.org
Subject: SolrCloud installation troubles...

Using Solr 7.2.0 and Zookeeper 3.4.11

In an effort to move to a more robust Solr environment, I'm setting up a 
prototype system of 3 Solr servers and 3 Zookeeper servers. For now, this is 
all on one machine, but will eventually be 3 machines.

This works fine on an Ubuntu 5.4.0-6 VM on my local system, but when I do the 
same setup on the company's network machine (a Red Hat 4.8.5-16 VM), I'm unable 
to create a collection. To keep things simple, I'm not using our custom schema 
yet, but just creating a collection through the Solr Admin UI using Collections 
> Add Collection, using the "_default" config set. On the Ubuntu system, I can 
create various collections .. 1 shard w/ 1 replication .. 2 shards w/ 3 
replications .. 3 shards w/ 4 replications .. all seem alive and well.

But when I do the same thing on the Red Hat system it fails. Through the UI, 
it'll first time out with this message ..

     Connection to Solr lost

Then after a refresh, the collection appears to have been partially created, 
but it's in the "Gone" state, and after some time, is deleted by an apparent 
cleanup process. If I try to create one through the command line ..

     ./bin/solr create -c test99 -n _default -s 2 -rf 2

I get this response ..

ERROR: Failed to create collection 'test99' due to: 
{10.6.208.31:8984_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8984/solr, 
10.6.208.31:8985_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8985/solr, 
10.6.208.31:8983_solr=org.apache.solr.client.solrj.SolrServerException:IOException
occured when talking to server at: http://10.6.208.31:8983/solr}

I've seen other reports of errors like this but no solutions that seem to apply 
to my situation. Any thoughts?

Thanks!
...scott




Re: SolrCloud installation troubles...

2018-01-29 Thread Shawn Heisey

On 1/29/2018 1:13 PM, Scott Prentice wrote:
> But when I do the same thing on the Red Hat system it fails. Through
> the UI, it'll first time out with this message ..
>
>     Connection to Solr lost
>
> Then after a refresh, the collection appears to have been partially
> created, but it's in the "Gone" state, and after some time, is deleted
> by an apparent cleanup process. If I try to create one through the
> command line ..
>
>     ./bin/solr create -c test99 -n _default -s 2 -rf 2
>
> I get this response ..
>
> ERROR: Failed to create collection 'test99' due to:
> {10.6.208.31:8984_solr=org.apache.solr.client.solrj.SolrServerException:IOException
> occured when talking to server at: http://10.6.208.31:8984/solr,
> 10.6.208.31:8985_solr=org.apache.solr.client.solrj.SolrServerException:IOException
> occured when talking to server at: http://10.6.208.31:8985/solr,
> 10.6.208.31:8983_solr=org.apache.solr.client.solrj.SolrServerException:IOException
> occured when talking to server at: http://10.6.208.31:8983/solr}


This sounds like either network connectivity problems or possibly issues 
caused by extreme garbage collection pauses that result in timeouts.


Thanks,
Shawn



SolrCloud installation troubles...

2018-01-29 Thread Scott Prentice

Using Solr 7.2.0 and Zookeeper 3.4.11

In an effort to move to a more robust Solr environment, I'm setting up a 
prototype system of 3 Solr servers and 3 Zookeeper servers. For now, 
this is all on one machine, but will eventually be 3 machines.


This works fine on an Ubuntu 5.4.0-6 VM on my local system, but when I do 
the same setup on the company's network machine (a Red Hat 4.8.5-16 VM), 
I'm unable to create a collection. To keep things simple, I'm not using 
our custom schema yet, but just creating a collection through the Solr 
Admin UI using Collections > Add Collection, using the "_default" config 
set. On the Ubuntu system, I can create various collections .. 1 shard 
w/ 1 replication .. 2 shards w/ 3 replications .. 3 shards w/ 4 
replications .. all seem alive and well.


But when I do the same thing on the Red Hat system it fails. Through the 
UI, it'll first time out with this message ..


    Connection to Solr lost

Then after a refresh, the collection appears to have been partially 
created, but it's in the "Gone" state, and after some time, is deleted 
by an apparent cleanup process. If I try to create one through the 
command line ..


    ./bin/solr create -c test99 -n _default -s 2 -rf 2

I get this response ..

ERROR: Failed to create collection 'test99' due to: 
{10.6.208.31:8984_solr=org.apache.solr.client.solrj.SolrServerException:IOException 
occured when talking to server at: http://10.6.208.31:8984/solr, 
10.6.208.31:8985_solr=org.apache.solr.client.solrj.SolrServerException:IOException 
occured when talking to server at: http://10.6.208.31:8985/solr, 
10.6.208.31:8983_solr=org.apache.solr.client.solrj.SolrServerException:IOException 
occured when talking to server at: http://10.6.208.31:8983/solr}


I've seen other reports of errors like this but no solutions that seem 
to apply to my situation. Any thoughts?
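
For reference, the same request can be sent straight to the Collections API, which makes it easy to try different node addresses (localhost versus the 10.6.208.31 address in the error). The sketch below only builds the URL. It assumes the command-line options map to the `name`, `collection.configName`, `numShards` and `replicationFactor` parameters of the CREATE action, and the base URL and function name are made up for the example.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, config_name, num_shards,
                          replication_factor, max_shards_per_node=None):
    """Build a Collections API CREATE URL for one Solr node."""
    params = {
        "action": "CREATE",
        "name": name,
        "collection.configName": config_name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "wt": "json",
    }
    # With more replicas than nodes, a per-node limit may need to be raised.
    if max_shards_per_node is not None:
        params["maxShardsPerNode"] = max_shards_per_node
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    print(create_collection_url("http://localhost:8983/solr",
                                "test99", "_default", 2, 2,
                                max_shards_per_node=2))
```

Opening the resulting URL in a browser or with curl on the server shows the raw response from that node, which can be easier to read than the wrapped error above.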


Thanks!
...scott