On Wed, Jan 31, 2018 at 10:14 AM, Anda Nicolae <anico...@lenovo.com> wrote:
> I've let the deployment run overnight and it failed after almost 4hrs with
> the errors below. Do you happen to know the config file where I can decrease
> the timeout? I looked in /etc/nova/nova.conf and in ironic config files but I
> couldn't find anything relevant.

This is the default timeout for the heat stack which creates the nodes via
nova. It is set with the --timeout option (in minutes) of the openstack
overcloud deploy command; setting a lower value means your stalled deployment
fails faster.

> The errors are:
>
> [overcloud.Compute.0]: CREATE_FAILED ResourceInError:
> resources[0].resources.NovaCompute: Went to status ERROR due to "Message:
> Unknown, Code: Unknown"
> [overcloud.Controller.0]: CREATE_FAILED Resource CREATE failed:
> ResourceInError: resources.Controller: Went to status ERROR due to "Message:
> No valid host was found. There are not enough hosts available., Code: 500"
>
> It is unclear to me why the above errors occur, since in my instackenv.json I
> declared node capabilities for both the computer and the controller node to
> be greater than the compute and controller flavors from 'openstack flavor
> list'.
>
> However, I've found this link and I am looking over it:
> https://docs.openstack.org/ironic/latest/admin/troubleshooting.html#nova-returns-no-valid-host-was-found-error

I would check the nova-scheduler log on the undercloud
(/var/log/nova/nova-scheduler.log) and look carefully at the state of the
ironic nodes. Basically this error means the nova-scheduler can't find any
nodes to deploy to (or not enough to satisfy the node counts requested). This
can have various causes, such as bad node state (not available and/or in
maintenance mode), problems with power control (e.g. nodes not responsive via
IPMI), or mistakes with node tagging, profile matching or flavors.

The nova-scheduler log is fairly noisy, but if you look carefully you will
often find some clues as to why the nodes aren't being started correctly. Also
check the status of the nova and ironic services on the undercloud to make
sure that, for example, ironic-conductor hasn't failed for any reason.
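
Since the scheduler log is noisy, a small filter can help narrow it down. The
following is only a rough sketch in Python, not an official tool: the function
names (filter_log_lines, scan_log) and the default keyword list are
assumptions made for illustration. It simply keeps lines that mention a log
level of ERROR or WARNING, or the "No valid host" text from the errors quoted
above; adjust the keywords to whatever you see in your own log.

```python
from typing import Iterable, List, Sequence

# Assumed keywords: two log levels plus the error text quoted in this thread.
# They are illustrative defaults, not a list defined by nova itself.
DEFAULT_KEYWORDS = ("ERROR", "WARNING", "No valid host")


def filter_log_lines(lines: Iterable[str],
                     keywords: Sequence[str] = DEFAULT_KEYWORDS) -> List[str]:
    """Return lines containing at least one keyword (case-sensitive match)."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line for keyword in keywords)
    ]


def scan_log(path: str = "/var/log/nova/nova-scheduler.log",
             keywords: Sequence[str] = DEFAULT_KEYWORDS) -> List[str]:
    """Read the log at `path` and return only the lines matching `keywords`."""
    # errors="replace" avoids crashing on stray non-UTF-8 bytes in the log.
    with open(path, errors="replace") as handle:
        return filter_log_lines(handle, keywords)
```

On the undercloud you could call scan_log() and print each returned line, or
pass a different path to look at another service log in the same way. Reading
the log normally requires sufficient permissions on /var/log/nova.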

Hope that helps,

Steve Hardy

>
> Thanks,
> Anda
>
> -----Original Message-----
> From: Raoul Scarazzini [mailto:ra...@redhat.com]
> Sent: Tuesday, January 30, 2018 8:17 PM
> To: Anda Nicolae; users@lists.rdoproject.org
> Subject: Re: [rdo-users] RHOSP 10 failed overcloud deployment
>
> On 01/30/2018 04:39 PM, Anda Nicolae wrote:
>> Got it.
>>
>> I've noticed that it spends quite some time in CREATE_IN_PROGRESS state for
>> OS::Heat::ResourceGroup resource (on Controller node).
>> Overcloud deployment fails after 4h. I will check in which config file is
>> the overcloud deployment timeout configured and decrease it.
>>
>> Thanks,
>> Anda
>
> Check also network settings. 4h timeout is the default when something is
> unreachable.
>
> --
> Raoul Scarazzini
> ra...@redhat.com
> _______________________________________________
> users mailing list
> users@lists.rdoproject.org
> http://lists.rdoproject.org/mailman/listinfo/users
>
> To unsubscribe: users-unsubscr...@lists.rdoproject.org