On Thu, Sep 29, 2016 at 3:11 PM, Martin Perina <mper...@redhat.com> wrote:
>
> On Thu, Sep 29, 2016 at 3:04 PM, Simone Tiraboschi <stira...@redhat.com>
> wrote:
>
>>
>> On Thu, Sep 29, 2016 at 12:47 PM, Martin Perina <mper...@redhat.com>
>> wrote:
>>
>>> Hi,
>>>
>>> please take a look at my inline comments:
>>>
>>> On Tue, Sep 27, 2016 at 7:23 PM, Gervais de Montbrun <
>>> gerv...@demontbrun.com> wrote:
>>>
>>>> Hey All,
>>>>
>>>> Since updating to 4.0.x of oVirt, I have had an issue with my hosted
>>>> engine. After some poking around, I think I have figured out my issue
>>>> and thought I would share to see what others think.
>>>> The issue has existed with 4.0, 4.0.1, 4.0.2, 4.0.3, and still exists
>>>> in 4.0.4.
>>>>
>>>> Description:
>>>> When my hosted engine starts it reports that it is in a degraded state
>>>> with 7 or 8 services still not started when I run systemctl status. It
>>>> takes about 6 or 7 minutes to eventually start all the services and come
>>>> online. If I don't set my cluster to Global-Maintenance mode it eventually
>>>> thinks that my hosted-engine needs to be rebooted and restarts it before
>>>> it can start everything.
>>>>
>>>
>>> Could you please share with us logs gathered by ovirt-log-collector?
>>>
>>> It's just a guess, but could you please check whether your HE VM has
>>> enough entropy?
>>>
>>> cat /proc/sys/kernel/random/entropy_avail
>>>
>>> If the value is low (below or around 200), you really need to install
>>> and configure some entropy generator such as haveged.
>>>
>>>
>>>> Solution:
>>>> I realized that Apache was the culprit and found that the proxy to the
>>>> ovirt-engine in /etc/httpd/conf.d/z-ovirt-engine-proxy.conf has a
>>>> super long timeout with many retries. I changed the settings and now
>>>> everything works for me.
>>>>
>>>> -> Before change:
>>>>
>>>> <LocationMatch ^/(ovirt-engine($|/)|api($|/)|
>>>> RHEVManagerWeb/|OvirtEngineWeb/|ca.crt$|engine.ssh.key.txt$|
>>>> rhevm.ssh.key.txt$)>
>>>>     ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>>>
>>>>     <IfModule deflate_module>
>>>>         AddOutputFilterByType DEFLATE text/javascript text/css
>>>>         text/html text/xml text/json application/xml application/json
>>>>         application/x-yaml
>>>>     </IfModule>
>>>> </LocationMatch>
>>>>
>>>>
>>>> -> After change:
>>>>
>>>> <LocationMatch ^/ovirt-engine($|/)>
>>>>     ProxyPassMatch ajp://127.0.0.1:8702 timeout=5 retry=2
>>>>
>>>>     <IfModule deflate_module>
>>>>         AddOutputFilterByType DEFLATE text/javascript text/css
>>>>         text/html text/xml text/json application/xml application/json
>>>>         application/x-yaml
>>>>     </IfModule>
>>>> </LocationMatch>
>>>>
>>>>
>>> This one is correct for 4.0, not sure why it was not updated during
>>> upgrade from 3.6. @Simone?
>>>
>>>
>>
>> Honestly it's
>> <LocationMatch ^/ovirt-engine($|/)>
>>     ProxyPassMatch ajp://127.0.0.1:8702 timeout=3600 retry=5
>>
>>     <IfModule deflate_module>
>>         AddOutputFilterByType DEFLATE text/javascript text/css
>>         text/html text/xml text/json application/xml application/json
>>         application/x-yaml
>>     </IfModule>
>> </LocationMatch>
>> also on a fresh 4.0 engine from our latest engine-appliance.
>>
>
> Right, I missed the timeout/retry option changes. But the important part
> is why the old configuration (with a different LocationMatch) was not
> overwritten during upgrade.
>

I suspect that it could have been overwritten a second time with its 3.6 value by our backup/restore procedure. Adding Didi here.

>
>>
>>>
>>>> If I read the timeout settings correctly, it will wait 60 minutes with
>>>> 5 retries. 5 hours is way too long for my little server to hold onto all
>>>> those apache processes.
>>>>
>>>> The change I made allows for there to be an error, and also releases
>>>> apache's hold on the process. Once everything is ready, apache is ready
>>>> to serve requests and everything/everyone is happy. Before making the
>>>> change, I would just get a white screen in my browser and then nothing
>>>> worked until I restarted Apache (or I ended up in an endless loop of
>>>> ovirt-ha services restarting my hosted-engine).
>>>>
>>>
>>> Well, if you have an issue with too many apache processes waiting for
>>> engine to respond, then there's some issue in engine. As I wrote above
>>> please share the logs with us and check entropy.
>>>
>>> Thanks
>>>
>>> Martin Perina
>>>
>>>
>>>>
>>>> I noticed that this setting reverts to the original setting, so oVirt
>>>> must be writing this file. Perhaps these numbers can be changed in oVirt?
>>>> If not, I will just set up an ansible play to revert the settings to
>>>> working values and restart apache on my engine.
>>>> :-)
>>>>
>>>> Cheers,
>>>> Gervais
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>
>>
>
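
For anyone who wants to check their own engine VM, here is a minimal Python sketch of the two checks discussed in this thread: whether entropy_avail is at or below the 200 mark Martin mentioned, and which timeout/retry values the proxy file currently carries. The function names entropy_is_low and proxy_params are made up for illustration and are not part of oVirt. The file paths are the ones quoted above. One caveat, as far as I recall from the Apache mod_proxy documentation: both timeout and retry are given in seconds, and retry is how long a worker stays in the error state before it is tried again, not a retry count.

```python
import re
from pathlib import Path

# Paths quoted in the thread; LOW_ENTROPY is the "below or around 200" mark.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")
PROXY_CONF = Path("/etc/httpd/conf.d/z-ovirt-engine-proxy.conf")
LOW_ENTROPY = 200


def entropy_is_low(value: int, threshold: int = LOW_ENTROPY) -> bool:
    """Return True when the entropy value is at or below the threshold."""
    return value <= threshold


def proxy_params(conf_text: str) -> list:
    """Return one dict of integer key=value options per ProxyPassMatch line."""
    params = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("ProxyPassMatch"):
            continue
        # Picks up options such as timeout=3600 and retry=5.
        found = re.findall(r"\b(\w+)=(\d+)\b", stripped)
        params.append({key: int(val) for key, val in found})
    return params


if __name__ == "__main__":
    # Both checks are skipped quietly when the file is not present.
    if ENTROPY_PATH.exists():
        value = int(ENTROPY_PATH.read_text().strip())
        print(f"entropy_avail={value} low={entropy_is_low(value)}")
    if PROXY_CONF.exists():
        for opts in proxy_params(PROXY_CONF.read_text()):
            print(f"ProxyPassMatch options: {opts}")
```

Run on the engine VM after an upgrade or restore, this should show quickly whether the proxy file has gone back to the old values.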