Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Gervais de Montbrun
Hi Simone,

Thanks for the info. I'll look at the solution that you suggested.

Cheers,
Gervais



> On Sep 29, 2016, at 10:01 AM, Simone Tiraboschi wrote:
> 
> On Thu, Sep 29, 2016 at 2:51 PM, Gervais de Montbrun wrote:
> Hi Martin,
> 
> The entropy was super low. Somewhere around 140. I installed and configured 
> haveged.service to start at bootup, reverted my apache changes... After a 
> reboot, my systemctl status still says that there are 7 services queued (note 
> that I erroneously said degraded in my previous email - the services are, in 
> fact, queued), but the oVirt GUI comes up almost immediately and everything 
> seems to be great.
> 
> 
> Be aware that haveged on a VM should not be considered a good source of 
> entropy, and the oVirt PKI is managed by the engine.
> http://security.stackexchange.com/questions/34523/is-it-appropriate-to-use-haveged-as-a-source-of-entropy-on-virtual-machines
> 
> A better approach is the virtio-rng paravirtualised RNG driver, as in patch 
> https://gerrit.ovirt.org/#/c/62334/ 
> 
> Thank you for the tip. You solved my issue.
> 
> Cheers,
> Gervais
> 
> 
> 
>> On Sep 29, 2016, at 7:47 AM, Martin Perina wrote:
>> 
>> Hi,
>> 
>> please take a look at my inline comments:
>> 
>> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun wrote:
>> Hey All,
>> 
>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted engine. 
>> After some poking around, I think I have figured out my issue and thought 
>> I would share to see what others think.
>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in 
>> 4.0.4.
>> 
>> Description:
>> When my hosted engine starts it reports that it is in a degraded state with 
>> 7 or 8 services still not started when I run systemctl status. It takes 
>> about 6 or 7 minutes to eventually start all the services and come online. 
>> If I don't set my cluster to Global-Maintenance mode it eventually thinks 
>> that my hosted-engine needs to be rebooted and restarts it before it can 
>> start everything.
>> 
>> Could you please share with us the logs gathered by ovirt-log-collector?
>> 
>> It's just a guess, but could you please check whether your HE VM has enough 
>> entropy?
>> 
>>   cat /proc/sys/kernel/random/entropy_avail
>> 
>> If the value is low (below or around 200), you really need to install and 
>> configure an entropy generator such as haveged.
>> 
>> 
>> Solution:
>> I realized that Apache was the culprit and found that the proxy to the 
>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super long 
>> timeout with many retries. I changed the settings and now everything works 
>> for me.
>> 
>> -> Before change:
>> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>> 
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css text/html 
>> text/xml text/json application/xml application/json application/x-yaml
>> 
>> 
>> 
>> -> After change:
>> 
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>> 
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css text/html 
>> text/xml text/json application/xml application/json application/x-yaml
>> 
>> 
>> 
>> This one is correct for 4.0, not sure why it was not updated during 
>> upgrade from 3.6. @Simone?
>> 
>> 
>> If I read the timeout settings correctly, it will wait 60 minutes with 5 
>> retries. 5 hours is way too long for my little server to hold onto all those 
>> apache processes.
>> The change I made allows for there to be an error, and also releases 
>> apache's hold on the process. Once everything is ready, apache is ready to 
>> serve requests and everything/everyone is happy. Before making the change, I 
>> just get a white screen in my browser and then nothing works until I restart 
>> Apache (or I end up in an endless loop of ovirt-ha services restarting my 
>> hosted-engine).
>> 
>> Well, if you have an issue with too many apache processes waiting for the 
>> engine to respond, then there's some issue in the engine. As I wrote above, 
>> please share the logs with us and check the entropy.
>> 
>> Thanks
>> 
>> Martin Perina
>> ​ 
>> 
>> I noticed that this setting reverts to the original setting, so oVirt must 
>> be writing this file. Perhaps these numbers can be changed in oVirt? If not, 
>> I will just set up an Ansible play to restore the working values and 
>> restart apache on my engine.
>> :-)
>> 
>> 

Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Gervais de Montbrun
Hi Simone,

Yes... I guess it was not clear in my original email. I changed the numbers 
myself to lower the timeout and retries. With them set as they were set by 
ovirt (timeout=3600 retry=5) things were not working for me. 
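A side note on those two values: per the Apache mod_proxy documentation, timeout is the number of seconds Apache waits on the backend connection, and retry is the number of seconds a worker in the error state is left alone before Apache tries it again, so it is not a retry count. Below is a minimal Python sketch for pulling the parameters out of such a line. The helper name parse_proxy_params is made up for illustration and is not part of oVirt or Apache.

```python
def parse_proxy_params(directive: str) -> dict:
    """Return the key=value parameters of a ProxyPass/ProxyPassMatch line.

    Hypothetical helper for illustration only. Numeric values are
    converted to int; anything else is kept as a string.
    """
    params = {}
    for token in directive.split():
        if "=" in token:
            key, _, value = token.partition("=")
            params[key] = int(value) if value.isdigit() else value
    return params


# The two variants discussed in this thread.
before = parse_proxy_params("ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5")
after = parse_proxy_params("ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2")

for label, params in (("before", before), ("after", after)):
    print(f"{label}: wait up to {params['timeout']}s on the backend, "
          f"leave a failed worker alone for {params['retry']}s")
```

Read that way, the shipped values mean a single request may wait up to an hour on the engine, which fits the symptom of Apache processes piling up while the engine is slow to start.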

Cheers,
Gervais



> On Sep 29, 2016, at 10:04 AM, Simone Tiraboschi wrote:
> 
> On Thu, Sep 29, 2016 at 12:47 PM, Martin Perina wrote:
> Hi,
> 
> please take a look at my inline comments:
> 
> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun wrote:
> Hey All,
> 
> Since updating to 4.0.x of oVirt, I have had an issue with my hosted engine. 
> After some poking around, I think I have figured out my issue and thought I 
> would share to see what others think.
> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in 
> 4.0.4.
> 
> Description:
> When my hosted engine starts it reports that it is in a degraded state with 7 
> or 8 services still not started when I run systemctl status. It takes about 6 
> or 7 minutes to eventually start all the services and come online. If I don't 
> set my cluster to Global-Maintenance mode it eventually thinks that my 
> hosted-engine needs to be rebooted and restarts it before it can start 
> everything.
> 
> Could you please share with us the logs gathered by ovirt-log-collector?
> 
> It's just a guess, but could you please check whether your HE VM has enough 
> entropy?
> 
>   cat /proc/sys/kernel/random/entropy_avail
> 
> If the value is low (below or around 200), you really need to install and 
> configure an entropy generator such as haveged.
> 
> 
> Solution:
> I realized that Apache was the culprit and found that the proxy to the 
> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super long 
> timeout with many retries. I changed the settings and now everything works 
> for me.
> 
> -> Before change:
> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
> 
> 
> AddOutputFilterByType DEFLATE text/javascript text/css text/html 
> text/xml text/json application/xml application/json application/x-yaml
> 
> 
> 
> -> After change:
> 
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
> 
> 
> AddOutputFilterByType DEFLATE text/javascript text/css text/html 
> text/xml text/json application/xml application/json application/x-yaml
> 
> 
> 
> This one is correct for 4.0, not sure why it was not updated during 
> upgrade from 3.6. @Simone?
> 
> Honestly it's
> 
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
> 
> 
> AddOutputFilterByType DEFLATE text/javascript text/css text/html 
> text/xml text/json application/xml application/json application/x-yaml
> 
> 
> also on a fresh 4.0 engine from our latest engine-appliance.
>  
> 
> If I read the timeout settings correctly, it will wait 60 minutes with 5 
> retries. 5 hours is way too long for my little server to hold onto all those 
> apache processes.
> The change I made allows for there to be an error, and also releases apache's 
> hold on the process. Once everything is ready, apache is ready to serve 
> requests and everything/everyone is happy. Before making the change, I just 
> get a white screen in my browser and then nothing works until I restart Apache 
> (or I end up in an endless loop of ovirt-ha services restarting my 
> hosted-engine).
> 
> Well, if you have an issue with too many apache processes waiting for the 
> engine to respond, then there's some issue in the engine. As I wrote above, 
> please share the logs with us and check the entropy.
> 
> Thanks
> 
> Martin Perina
> ​ 
> 
> I noticed that this setting reverts to the original setting, so oVirt must be 
> writing this file. Perhaps these numbers can be changed in oVirt? If not, I 
> will just set up an Ansible play to restore the working values and restart 
> apache on my engine.
> :-)
> 
> Cheers,
> Gervais
> 
> 
> 
> 
> ___
> Users mailing list
> Users@ovirt.org 
> http://lists.ovirt.org/mailman/listinfo/users 
> 
> 
> 
> 



Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Martin Perina
On Thu, Sep 29, 2016 at 3:04 PM, Simone Tiraboschi wrote:

>
>
> On Thu, Sep 29, 2016 at 12:47 PM, Martin Perina wrote:
>
>> Hi,
>>
>> please take a look at my inline comments:
>>
>> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <
>> gerv...@demontbrun.com> wrote:
>>
>>> Hey All,
>>>
>>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
>>> engine. After some poking around, I think I have figured out my issue and
>>> thought I would share to see what others think.
>>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in
>>> 4.0.4.
>>>
>>> Description:
>>> When my hosted engine starts it reports that it is in a degraded state
>>> with 7 or 8 services still not started when I run systemctl status. It
>>> takes about 6 or 7 minutes to eventually start all the services and come
>>> online. If I don't set my cluster to Global-Maintenance mode it eventually
>>> thinks that my hosted-engine needs to be rebooted and restarts it before it
>>> can start everything.
>>>
>>
>> ​Could you please share with us logs gathered by ovirt-log-collector?
>>
>> It's just a guess but could you please take a look if you HE VM has
>> enough entropy?
>>
>>   cat /proc/sys/kernel/random/entropy_avail
>>
>> If the value is low (below or around 200),  you really need to install
>> and configure some entropy generator such as haveged
>>
>>
>>> Solution:
>>> I realized that Apache was the culprit and found that the proxy to the
>>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super
>>> long timeout with many retries. I changed the settings and now everything
>>> works for me.
>>>
>>> -> Before change:
>>>
>>> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
>>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>>
>>> 
>>> AddOutputFilterByType DEFLATE text/javascript text/css
>>> text/html text/xml text/json application/xml application/json
>>> application/x-yaml
>>> 
>>> 
>>>
>>>
>>> -> After change:
>>>
>>> 
>>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>>>
>>> 
>>> AddOutputFilterByType DEFLATE text/javascript text/css
>>> text/html text/xml text/json application/xml application/json
>>> application/x-yaml
>>> 
>>> 
>>>
>>>
>> This one is correct for 4.0, not sure why it was not updated during
>> upgrade from 3.6. @Simone?
>>
>>
>
> Honestly it's
> 
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>
> 
> AddOutputFilterByType DEFLATE text/javascript text/css
> text/html text/xml text/json application/xml application/json
> application/x-yaml
> 
> 
> also on a fresh 4.0 engine from our latest engine-appliance.
>

Right, I missed the timeout/retry option changes. But the important part
is why the old configuration (with a different LocationMatch) was not
overwritten during the upgrade.


>
>>
>>> If I read the timeout settings correctly, it will wait 60 minutes with 5
>>> retries. 5 hours is way too long for my little server to hold onto all
>>> those apache processes.
>>>
>>> The change I made allows for there to be an error, and also releases
>>> apache's hold on the process. Once everything is ready, apache is ready to
>>> serve requests and everything/everyone is happy. Before making the change,
>>> I just get a white screen in my browser and then nothing works until I
>>> restart Apache (or I end up in an endless loop of ovirt-ha services
>>> restarting my hosted-engine).
>>>
>>
>> ​Well, if you have an issue with too many apache processes waiting for
>> engine to respond, then there's some issue in engine. As I wrote above
>> please share the logs with us and check entropy.
>>
>> Thanks
>>
>> Martin Perina
>> ​
>>
>>
>>>
>>> I noticed that this setting reverts to the original setting, so oVirt
>>> must be writing this file. Perhaps these number can be changed in oVirt? If
>>> not, I will just setup and ansible play to revert the settings with working
>>> values and restart apache on my engine.
>>> :-)
>>>
>>> Cheers,
>>> Gervais
>>>
>>>
>>>
>>>
>>>
>>>
>>
>


Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Simone Tiraboschi
On Thu, Sep 29, 2016 at 3:11 PM, Martin Perina wrote:

>
>
> On Thu, Sep 29, 2016 at 3:04 PM, Simone Tiraboschi 
> wrote:
>
>>
>>
>> On Thu, Sep 29, 2016 at 12:47 PM, Martin Perina 
>> wrote:
>>
>>> Hi,
>>>
>>> please take a look at my inline comments:
>>>
>>> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <
>>> gerv...@demontbrun.com> wrote:
>>>
>>>> Hey All,
>>>>
>>>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
>>>> engine. After some poking around, I think I have figured out my issue and
>>>> thought I would share to see what others think.
>>>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists
>>>> in 4.0.4.
>>>>
>>>> Description:
>>>> When my hosted engine starts it reports that it is in a degraded state
>>>> with 7 or 8 services still not started when I run systemctl status. It
>>>> takes about 6 or 7 minutes to eventually start all the services and come
>>>> online. If I don't set my cluster to Global-Maintenance mode it eventually
>>>> thinks that my hosted-engine needs to be rebooted and restarts it before it
>>>> can start everything.

>>>
>>> ​Could you please share with us logs gathered by ovirt-log-collector?
>>>
>>> It's just a guess but could you please take a look if you HE VM has
>>> enough entropy?
>>>
>>>   cat /proc/sys/kernel/random/entropy_avail
>>>
>>> If the value is low (below or around 200),  you really need to install
>>> and configure some entropy generator such as haveged
>>>
>>>
>>>> Solution:
>>>> I realized that Apache was the culprit and found that the proxy to the
>>>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a
>>>> super long timeout with many retries. I changed the settings and now
>>>> everything works for me.
>>>>
>>>> -> Before change:
>>>>
>>>> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
>>>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>>>
>>>> AddOutputFilterByType DEFLATE text/javascript text/css
>>>> text/html text/xml text/json application/xml application/json
>>>> application/x-yaml
>>>>
>>>> -> After change:
>>>>
>>>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>>>>
>>>> AddOutputFilterByType DEFLATE text/javascript text/css
>>>> text/html text/xml text/json application/xml application/json
>>>> application/x-yaml
>>>>
>>> This one is correct for 4.0, not sure why it was not updated during
>>> upgrade from 3.6. @Simone?
>>>
>>>
>>
>> Honestly it's
>> 
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css
>> text/html text/xml text/json application/xml application/json
>> application/x-yaml
>> 
>> 
>> also on a fresh 4.0 engine from our latest engine-appliance.
>>
>
> ​Right, I missed the timeout​/retry option changes. But the important part
> is why old configuration (with different LocationMatch) was not overwritten
> during upgrade.
>
>
I suspect that it could have been overwritten a second time with its 3.6
value by our backup/restore procedure.
Adding Didi here.


>
>>
>>>
>>>> If I read the timeout settings correctly, it will wait 60 minutes with
>>>> 5 retries. 5 hours is way too long for my little server to hold onto all
>>>> those apache processes.
>>>>
>>>> The change I made allows for there to be an error, and also releases
>>>> apache's hold on the process. Once everything is ready, apache is ready to
>>>> serve requests and everything/everyone is happy. Before making the change,
>>>> I just get a white screen in my browser and then nothing works until I
>>>> restart Apache (or I end up in an endless loop of ovirt-ha services
>>>> restarting my hosted-engine).

>>>
>>> ​Well, if you have an issue with too many apache processes waiting for
>>> engine to respond, then there's some issue in engine. As I wrote above
>>> please share the logs with us and check entropy.
>>>
>>> Thanks
>>>
>>> Martin Perina
>>> ​
>>>
>>>

>>>> I noticed that this setting reverts to the original setting, so oVirt
>>>> must be writing this file. Perhaps these numbers can be changed in oVirt? If
>>>> not, I will just set up an Ansible play to restore the working values
>>>> and restart apache on my engine.
>>>> :-)
>>>>
>>>> Cheers,
>>>> Gervais


>>>
>>
>


Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Simone Tiraboschi
On Thu, Sep 29, 2016 at 12:47 PM, Martin Perina wrote:

> Hi,
>
> please take a look at my inline comments:
>
> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <
> gerv...@demontbrun.com> wrote:
>
>> Hey All,
>>
>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
>> engine. After some poking around, I think I have figured out my issue and
>> thought I would share to see what others think.
>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in
>> 4.0.4.
>>
>> Description:
>> When my hosted engine starts it reports that it is in a degraded state
>> with 7 or 8 services still not started when I run systemctl status. It
>> takes about 6 or 7 minutes to eventually start all the services and come
>> online. If I don't set my cluster to Global-Maintenance mode it eventually
>> thinks that my hosted-engine needs to be rebooted and restarts it before it
>> can start everything.
>>
>
> ​Could you please share with us logs gathered by ovirt-log-collector?
>
> It's just a guess but could you please take a look if you HE VM has enough
> entropy?
>
>   cat /proc/sys/kernel/random/entropy_avail
>
> If the value is low (below or around 200),  you really need to install and
> configure some entropy generator such as haveged
>
>
>> Solution:
>> I realized that Apache was the culprit and found that the proxy to the
>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super
>> long timeout with many retries. I changed the settings and now everything
>> works for me.
>>
>> -> Before change:
>>
>> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css
>> text/html text/xml text/json application/xml application/json
>> application/x-yaml
>> 
>> 
>>
>>
>> -> After change:
>>
>> 
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>>
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css
>> text/html text/xml text/json application/xml application/json
>> application/x-yaml
>> 
>> 
>>
>>
> ​This one is correct for 4.0​
> ​, not sure why it was not updated during upgrade from 3.6. @Simone?
> ​
>

Honestly it's

ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5


AddOutputFilterByType DEFLATE text/javascript text/css
text/html text/xml text/json application/xml application/json
application/x-yaml


also on a fresh 4.0 engine from our latest engine-appliance.


>
>> If I read the timeout settings correctly, it will wait 60 minutes with 5
>> retries. 5 hours is way too long for my little server to hold onto all
>> those apache processes.
>>
> The change I made allows for there to be an error, and also releases
>> apache's hold on the process. Once everything is ready, apache is ready to
>> serve requests and everything/everyone is happy. Before making the change,
>> I just get a whitescreen in my browser and then nothing works until I
>> restart Apache (or I end up in an endless loop of ovirt-ha services
>> restarting my hosted-engine.
>>
>
> ​Well, if you have an issue with too many apache processes waiting for
> engine to respond, then there's some issue in engine. As I wrote above
> please share the logs with us and check entropy.
>
> Thanks
>
> Martin Perina
> ​
>
>
>>
>> I noticed that this setting reverts to the original setting, so oVirt
>> must be writing this file. Perhaps these number can be changed in oVirt? If
>> not, I will just setup and ansible play to revert the settings with working
>> values and restart apache on my engine.
>> :-)
>>
>> Cheers,
>> Gervais
>>
>>
>>
>>
>>
>>
>


Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Simone Tiraboschi
On Thu, Sep 29, 2016 at 2:51 PM, Gervais de Montbrun wrote:

> Hi Martin,
>
> The entropy was super low. Somewhere around 140. I installed and
> configured haveged.service to start at bootup, reverted my apache
> changes... After a reboot, my systemctl status still says that there are 7
> services queued (note that I erroneously said degraded in my previous email
> - the services are, in fact, queued), but the oVirt GUI comes up almost
> immediately and everything seems to be great.
>
>
Be aware that haveged on a VM should not be considered a good source of
entropy, and the oVirt PKI is managed by the engine.
http://security.stackexchange.com/questions/34523/is-it-appropriate-to-use-haveged-as-a-source-of-entropy-on-virtual-machines

A better approach is the virtio-rng paravirtualised RNG driver, as in patch
https://gerrit.ovirt.org/#/c/62334/
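
To confirm from inside the engine VM that a virtio RNG device is actually exposed, the kernel's hw_random sysfs entries can be read. This is a rough Python sketch, assuming the usual /sys/class/misc/hw_random location and assuming that virtio devices are listed with a name starting with virtio_rng. The function name hw_rng_status is only illustrative.

```python
from pathlib import Path

# Usual sysfs location of the kernel hw_random framework (assumed layout).
HW_RANDOM_DIR = Path("/sys/class/misc/hw_random")


def hw_rng_status(sysfs_dir: Path = HW_RANDOM_DIR) -> dict:
    """Report which hardware RNGs the kernel sees and which one is in use."""

    def read(name: str) -> str:
        entry = sysfs_dir / name
        return entry.read_text().strip() if entry.exists() else ""

    available = read("rng_available").split()
    return {
        "available": available,
        "current": read("rng_current"),
        # Assumption: virtio-rng devices appear as e.g. "virtio_rng.0".
        "virtio": any(name.startswith("virtio_rng") for name in available),
    }


if __name__ == "__main__":
    status = hw_rng_status()
    print(status)
    if not status["virtio"]:
        print("no virtio RNG visible; check that the VM has an RNG device attached")
```

If the list comes back empty, the guest has no hardware RNG at all and is relying on whatever the kernel can gather by itself.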



> Thank you for the tip. You solved my issue.
>
> Cheers,
> Gervais
>
>
>
> On Sep 29, 2016, at 7:47 AM, Martin Perina  wrote:
>
> Hi,
>
> please take a look at my inline comments:
>
> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <
> gerv...@demontbrun.com> wrote:
>
>> Hey All,
>>
>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
>> engine. After some poking around, I think I have figured out my issue and
>> thought I would share to see what others think.
>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in
>> 4.0.4.
>>
>> Description:
>> When my hosted engine starts it reports that it is in a degraded state
>> with 7 or 8 services still not started when I run systemctl status. It
>> takes about 6 or 7 minutes to eventually start all the services and come
>> online. If I don't set my cluster to Global-Maintenance mode it eventually
>> thinks that my hosted-engine needs to be rebooted and restarts it before it
>> can start everything.
>>
>
> ​Could you please share with us logs gathered by ovirt-log-collector?
>
> It's just a guess but could you please take a look if you HE VM has enough
> entropy?
>
>   cat /proc/sys/kernel/random/entropy_avail
>
> If the value is low (below or around 200),  you really need to install and
> configure some entropy generator such as haveged
>
>
>> Solution:
>> I realized that Apache was the culprit and found that the proxy to the
>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super
>> long timeout with many retries. I changed the settings and now everything
>> works for me.
>>
>> -> Before change:
>>
>> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css
>> text/html text/xml text/json application/xml application/json
>> application/x-yaml
>> 
>> 
>>
>>
>> -> After change:
>>
>> 
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>>
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css
>> text/html text/xml text/json application/xml application/json
>> application/x-yaml
>> 
>> 
>>
>>
> ​This one is correct for 4.0​
> ​, not sure why it was not updated during upgrade from 3.6. @Simone?
> ​
>
>
>>
>> If I read the timeout settings correctly, it will wait 60 minutes with 5
>> retries. 5 hours is way too long for my little server to hold onto all
>> those apache processes.
>>
> The change I made allows for there to be an error, and also releases
>> apache's hold on the process. Once everything is ready, apache is ready to
>> serve requests and everything/everyone is happy. Before making the change,
>> I just get a whitescreen in my browser and then nothing works until I
>> restart Apache (or I end up in an endless loop of ovirt-ha services
>> restarting my hosted-engine.
>>
>
> ​Well, if you have an issue with too many apache processes waiting for
> engine to respond, then there's some issue in engine. As I wrote above
> please share the logs with us and check entropy.
>
> Thanks
>
> Martin Perina
> ​
>
>
>>
>> I noticed that this setting reverts to the original setting, so oVirt
>> must be writing this file. Perhaps these number can be changed in oVirt? If
>> not, I will just setup and ansible play to revert the settings with working
>> values and restart apache on my engine.
>> :-)
>>
>> Cheers,
>> Gervais
>>
>>
>>
>>
>>
>>
>
>


Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Gervais de Montbrun
Hi Martin,

The entropy was super low. Somewhere around 140. I installed and configured 
haveged.service to start at bootup, reverted my apache changes... After a 
reboot, my systemctl status still says that there are 7 services queued (note 
that I erroneously said degraded in my previous email - the services are, in 
fact, queued), but the oVirt GUI comes up almost immediately and everything 
seems to be great.
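
The check Martin suggested can also be scripted, for example to run it from monitoring after each engine boot. This is a minimal Python sketch, assuming the threshold of 200 from his message and the standard Linux /proc path; the function names are only illustrative.

```python
from pathlib import Path

# Standard Linux location of the kernel's entropy estimate.
ENTROPY_FILE = Path("/proc/sys/kernel/random/entropy_avail")
# Threshold taken from Martin's advice ("below or around 200").
LOW_ENTROPY_THRESHOLD = 200


def read_entropy(path: Path = ENTROPY_FILE) -> int:
    """Return the kernel's current entropy estimate."""
    return int(path.read_text().strip())


def is_entropy_low(value: int, threshold: int = LOW_ENTROPY_THRESHOLD) -> bool:
    """True when the estimate is at or below the threshold."""
    return value <= threshold


if __name__ == "__main__" and ENTROPY_FILE.exists():
    value = read_entropy()
    state = "LOW" if is_entropy_low(value) else "ok"
    print(f"entropy_avail={value} ({state})")
```

With the value of around 140 that I saw before installing haveged, this would have reported LOW.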

Thank you for the tip. You solved my issue.

Cheers,
Gervais



> On Sep 29, 2016, at 7:47 AM, Martin Perina wrote:
> 
> Hi,
> 
> please take a look at my inline comments:
> 
> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun wrote:
> Hey All,
> 
> Since updating to 4.0.x of oVirt, I have had an issue with my hosted engine. 
> After some poking around, I think I have figured out my issue and thought I 
> would share to see what others think.
> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in 
> 4.0.4.
> 
> Description:
> When my hosted engine starts it reports that it is in a degraded state with 7 
> or 8 services still not started when I run systemctl status. It takes about 6 
> or 7 minutes to eventually start all the services and come online. If I don't 
> set my cluster to Global-Maintenance mode it eventually thinks that my 
> hosted-engine needs to be rebooted and restarts it before it can start 
> everything.
> 
> ​Could you please share with us logs gathered by ovirt-log-collector?
> 
> It's just a guess but could you please take a look if you HE VM has enough 
> entropy?
> 
>   cat /proc/sys/kernel/random/entropy_avail
> 
> If the value is low (below or around 200),  you really need to install and 
> configure some entropy generator such as haveged
> 
> 
> Solution:
> I realized that Apache was the culprit and found that the proxy to the 
> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super long 
> timeout with many retries. I changed the settings and now everything works 
> for me.
> 
> -> Before change:
> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
> 
> 
> AddOutputFilterByType DEFLATE text/javascript text/css text/html 
> text/xml text/json application/xml application/json application/x-yaml
> 
> 
> 
> -> After change:
> 
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
> 
> 
> AddOutputFilterByType DEFLATE text/javascript text/css text/html 
> text/xml text/json application/xml application/json application/x-yaml
> 
> 
> 
> ​This one is correct for 4.0​​, not sure why it was not updated during 
> upgrade from 3.6. @Simone?
> ​ 
> 
> If I read the timeout settings correctly, it will wait 60 minutes with 5 
> retries. 5 hours is way too long for my little server to hold onto all those 
> apache processes.
> The change I made allows for there to be an error, and also releases apache's 
> hold on the process. Once everything is ready, apache is ready to serve 
> requests and everything/everyone is happy. Before making the change, I just 
> get a whitescreen in my browser and then nothing works until I restart Apache 
> (or I end up in an endless loop of ovirt-ha services restarting my 
> hosted-engine.
> 
> ​Well, if you have an issue with too many apache processes waiting for engine 
> to respond, then there's some issue in engine. As I wrote above please share 
> the logs with us and check entropy.
> 
> Thanks
> 
> Martin Perina
> ​ 
> 
> I noticed that this setting reverts to the original setting, so oVirt must be 
> writing this file. Perhaps these number can be changed in oVirt? If not, I 
> will just setup and ansible play to revert the settings with working values 
> and restart apache on my engine.
> :-)
> 
> Cheers,
> Gervais
> 
> 
> 
> 
> 
> 
> 



Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Artyom Lukianov
We have the same configuration in the file
/etc/httpd/conf.d/z-ovirt-engine-proxy.conf for the regular engine under
3.6 and 4.0, so I am not sure it is related to the problem.
Regarding the entropy level, see the bug
https://bugzilla.redhat.com/show_bug.cgi?id=1357246
Best Regards

On Thu, Sep 29, 2016 at 1:47 PM, Martin Perina  wrote:

> Hi,
>
> please take a look at my inline comments:
>
> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <
> gerv...@demontbrun.com> wrote:
>
>> Hey All,
>>
>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
>> engine. After some poking around, I think I have figured out my issue and
>> thought I would share to see what others think.
>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in
>> 4.0.4.
>>
>> Description:
>> When my hosted engine starts it reports that it is in a degraded state
>> with 7 or 8 services still not started when I run systemctl status. It
>> takes about 6 or 7 minutes to eventually start all the services and come
>> online. If I don't set my cluster to Global-Maintenance mode it eventually
>> thinks that my hosted-engine needs to be rebooted and restarts it before it
>> can start everything.
>>
>
> ​Could you please share with us logs gathered by ovirt-log-collector?
>
> It's just a guess but could you please take a look if you HE VM has enough
> entropy?
>
>   cat /proc/sys/kernel/random/entropy_avail
>
> If the value is low (below or around 200),  you really need to install and
> configure some entropy generator such as haveged
>
>
>> Solution:
>> I realized that Apache was the culprit and found that the proxy to the
>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super
>> long timeout with many retries. I changed the settings and now everything
>> works for me.
>>
>> -> Before change:
>>
>> > RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|
>> rhevm.ssh.key.txt$)>
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css
>> text/html text/xml text/json application/xml application/json
>> application/x-yaml
>> 
>> 
>>
>>
>> -> After change:
>>
>> 
>> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>>
>> 
>> AddOutputFilterByType DEFLATE text/javascript text/css
>> text/html text/xml text/json application/xml application/json
>> application/x-yaml
>> 
>> 
>>
>>
> This one is correct for 4.0; not sure why it was not updated during the
> upgrade from 3.6. @Simone?
>
>
>
>>
>> If I read the timeout settings correctly, it will wait 60 minutes with 5
>> retries. 5 hours is way too long for my little server to hold onto all
>> those apache processes.
>>
>> The change I made allows for there to be an error, and also releases
>> apache's hold on the process. Once everything is ready, apache is ready to
>> serve requests and everything/everyone is happy. Before making the change,
>> I would just get a white screen in my browser and nothing worked until I
>> restarted Apache (or I ended up in an endless loop of ovirt-ha services
>> restarting my hosted-engine).
>>
>
> Well, if you have an issue with too many apache processes waiting for the
> engine to respond, then there's some issue in the engine. As I wrote above,
> please share the logs with us and check entropy.
>
> Thanks
>
> Martin Perina
>
>
>
>>
>> I noticed that this setting reverts to the original setting, so oVirt
>> must be writing this file. Perhaps these numbers can be changed in oVirt? If
>> not, I will just set up an Ansible play to revert the settings to working
>> values and restart Apache on my engine.
>> :-)
>>
>> Cheers,
>> Gervais


Re: [ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-29 Thread Martin Perina
Hi,

please take a look at my inline comments:

On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun  wrote:

> Hey All,
>
> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
> engine. After some poking around, I think I have figured out my issue and
> thought I would share to see what others think.
> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in
> 4.0.4.
>
> Description:
> When my hosted engine starts it reports that it is in a degraded state
> with 7 or 8 services still not started when I run systemctl status. It
> takes about 6 or 7 minutes to eventually start all the services and come
> online. If I don't set my cluster to Global-Maintenance mode it eventually
> thinks that my hosted-engine needs to be rebooted and restarts it before it
> can start everything.
>

Could you please share with us the logs gathered by ovirt-log-collector?

It's just a guess, but could you please check whether your HE VM has enough
entropy?

  cat /proc/sys/kernel/random/entropy_avail

If the value is low (below or around 200), you really need to install and
configure an entropy generator such as haveged.
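
The same check can be scripted. What follows is only a minimal sketch in Python, assuming the threshold of 200 mentioned above; the function names and the "at or below" comparison are illustrative choices, not part of any oVirt tooling.

```python
from pathlib import Path

# Threshold taken from the advice above ("below or around 200").
LOW_ENTROPY_THRESHOLD = 200


def read_entropy(path="/proc/sys/kernel/random/entropy_avail"):
    """Return the kernel's available-entropy estimate as an integer."""
    return int(Path(path).read_text().strip())


def is_entropy_low(value, threshold=LOW_ENTROPY_THRESHOLD):
    """Return True when the value is at or below the threshold."""
    return value <= threshold
```

On the hosted-engine VM, calling is_entropy_low(read_entropy()) shortly after boot would show whether the services are likely to be waiting on the random pool.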


> Solution:
> I realized that Apache was the culprit and found that the proxy to the
> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super
> long timeout with many retries. I changed the settings and now everything
> works for me.
>
> -> Before change:
>
>  OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|rhevm.ssh.key.txt$)>
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>
> 
> AddOutputFilterByType DEFLATE text/javascript text/css
> text/html text/xml text/json application/xml application/json
> application/x-yaml
> 
> 
>
>
> -> After change:
>
> 
> ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>
> 
> AddOutputFilterByType DEFLATE text/javascript text/css
> text/html text/xml text/json application/xml application/json
> application/x-yaml
> 
> 
>
>
This one is correct for 4.0; not sure why it was not updated during the
upgrade from 3.6. @Simone?


>
> If I read the timeout settings correctly, it will wait 60 minutes with 5
> retries. 5 hours is way too long for my little server to hold onto all
> those apache processes.
>
> The change I made allows for there to be an error, and also releases
> apache's hold on the process. Once everything is ready, apache is ready to
> serve requests and everything/everyone is happy. Before making the change,
> I would just get a white screen in my browser and nothing worked until I
> restarted Apache (or I ended up in an endless loop of ovirt-ha services
> restarting my hosted-engine).
>

Well, if you have an issue with too many apache processes waiting for the
engine to respond, then there's some issue in the engine. As I wrote above,
please share the logs with us and check entropy.

Thanks

Martin Perina


>
> I noticed that this setting reverts to the original setting, so oVirt must
> be writing this file. Perhaps these numbers can be changed in oVirt? If not,
> I will just set up an Ansible play to revert the settings to working
> values and restart Apache on my engine.
> :-)
>
> Cheers,
> Gervais


[ovirt-users] oVirt 4.0.x - hosted-engine was not starting properly

2016-09-27 Thread Gervais de Montbrun
Hey All,

Since updating to 4.0.x of oVirt, I have had an issue with my hosted engine. 
After some poking around, I think I have figured out my issue and thought I 
would share to see what others think.
The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists in 4.0.4.

Description:
When my hosted engine starts it reports that it is in a degraded state with 7 
or 8 services still not started when I run systemctl status. It takes about 6 
or 7 minutes to eventually start all the services and come online. If I don't 
set my cluster to Global-Maintenance mode it eventually thinks that my 
hosted-engine needs to be rebooted and restarts it before it can start 
everything.

Solution:
I realized that Apache was the culprit and found that the proxy to the 
ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a super long 
timeout with many retries. I changed the settings and now everything works for 
me.

-> Before change:

ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5


AddOutputFilterByType DEFLATE text/javascript text/css text/html 
text/xml text/json application/xml application/json application/x-yaml



-> After change:

ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2


AddOutputFilterByType DEFLATE text/javascript text/css text/html 
text/xml text/json application/xml application/json application/x-yaml



If I read the timeout settings correctly, it will wait 60 minutes with 5 
retries. 5 hours is way too long for my little server to hold onto all those 
apache processes. The change I made allows for there to be an error, and also 
releases apache's hold on the process. Once everything is ready, apache is 
ready to serve requests and everything/everyone is happy. Before making the 
change, I would just get a white screen in my browser and nothing worked until I 
restarted Apache (or I ended up in an endless loop of ovirt-ha services 
restarting my hosted-engine).
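
For anyone checking the numbers: the Apache mod_proxy documentation describes timeout as the connection timeout in seconds, and retry as the number of seconds a worker in the error state is held back before it is tried again, rather than a retry count. So timeout=3600 is a 60-minute wait, while retry=5 is a 5-second pause. A minimal sketch in Python that pulls the two values out of the lines above and converts the timeout to minutes (the helper name is made up for illustration):

```python
import re


def parse_proxy_params(directive):
    """Extract integer key=value parameters from a ProxyPass-style line."""
    pairs = re.findall(r"\b(\w+)=(\d+)\b", directive)
    return {key: int(value) for key, value in pairs}


before = parse_proxy_params(
    "ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5"
)
after = parse_proxy_params(
    "ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2"
)

# Both parameters are expressed in seconds; show the timeout in minutes too.
for label, params in (("before", before), ("after", after)):
    minutes = params["timeout"] / 60
    print(f"{label}: timeout={params['timeout']}s ({minutes:g} min), "
          f"retry={params['retry']}s")
```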

I noticed that this setting reverts to the original setting, so oVirt must be 
writing this file. Perhaps these numbers can be changed in oVirt? If not, I will 
just set up an Ansible play to revert the settings to working values and 
restart Apache on my engine.
:-)
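
Until then, the revert itself is simple enough to script. This is not the Ansible play, just a rough Python sketch of the same edit; the function name and its defaults (the timeout=5 retry=2 values from above) are my own assumptions.

```python
import re


def set_proxy_params(conf_text, timeout=5, retry=2):
    """Rewrite timeout= and retry= on every ProxyPassMatch line.

    Lines that do not contain ProxyPassMatch are returned unchanged.
    """
    def patch(match):
        line = match.group(0)
        line = re.sub(r"\btimeout=\d+", f"timeout={timeout}", line)
        line = re.sub(r"\bretry=\d+", f"retry={retry}", line)
        return line

    return re.sub(r"^.*\bProxyPassMatch\b.*$", patch, conf_text,
                  flags=re.MULTILINE)


sample = (
    "ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5\n"
    "AddOutputFilterByType DEFLATE text/javascript text/css\n"
)
patched = set_proxy_params(sample)
```

To use it for real, read /etc/httpd/conf.d/z-ovirt-engine-proxy.conf, write the result back, and restart Apache (systemctl restart httpd). Running it twice gives the same result, so it is safe to repeat whenever the file gets overwritten.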

Cheers,
Gervais


