On Tue, Jan 26, 2016 at 11:42 AM, Stanislaw Bogatkin <sbogat...@mirantis.com> wrote:
> Hi guys,
>
> for some time we have had a bug [0] with ntpdate. It doesn't reproduce 100% of
> the time, but it breaks our BVT and swarm tests. There is no exact point where
> the root of the problem is located. To better understand this, some verbosity
> was added to the ntpdate output, but in the logs we can see only that the
> packet exchange between ntpdate and the server was started and never completed.
>
When I've hit this in my local environments, there has usually been one of two
possible causes: 1) a lack of network connectivity, so the NTP server never
responds, or 2) the stratum is too high. My assumption is that we're running
into #2 because of the revert-resume in our testing. When we resume, the NTP
server on the master may take a while to become stable. The sync in the
deployment uses the Fuel master for synchronization, so if the stratum is too
high, it fails with this lovely useless error.

My guess at the underlying cause is that we aren't using a set of internal NTP
servers but are instead relying on the standard ntp.org pools. When the master
is resumed, it struggles to find a good enough set of servers, so it takes a
while to sync. The deployment tasks then fail because the master has not yet
stabilized (this might also be geolocation related). We could address this
either by fudging the stratum on the master server in the configs or by
introducing our own more stable local NTP servers. I have a feeling fudging the
stratum might be better when we only use the master in our NTP configuration.

> As this bug is a blocker, I propose to merge [1] to better understand
> what's going on. I created a custom ISO with this patchset and tried to run
> about 10 BVT tests on this ISO. Absolutely with no luck. So, if we
> merge this, we would catch the problem much faster and understand the root
> cause.
>

I think we should merge the increased logging patch anyway because it'll be
useful in troubleshooting, but we also might want to look into getting an NTP
peers list added to the snapshot.

> I appreciate your answers, folks.
>
>
> [0] https://bugs.launchpad.net/fuel/+bug/1533082
> [1] https://review.openstack.org/#/c/271219/
> --
> with best regards,
> Stan.
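
To make the stratum theory above a bit more concrete, here is a minimal Python sketch of the kind of check a client could apply to the master's reply. It assumes the header layout and field meanings from RFC 5905: leap indicator 3 means unsynchronized, stratum 0 is unspecified and stratum 16 is unsynchronized. It has not been checked against ntpdate's exact rejection rules, and the names parse_ntp_header, is_usable_time_source and build_reply are hypothetical, made up for illustration.

```python
import struct

LEAP_UNSYNCHRONIZED = 3   # leap indicator value meaning "clock not synchronized"
MAX_USABLE_STRATUM = 15   # 16 means unsynchronized; 0 is unspecified


def parse_ntp_header(packet: bytes) -> dict:
    """Extract leap indicator, version, mode and stratum from an NTP packet."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    # Byte 0 packs LI (2 bits), VN (3 bits) and Mode (3 bits); byte 1 is stratum.
    first_byte, stratum = struct.unpack("!BB", packet[:2])
    return {
        "leap": first_byte >> 6,
        "version": (first_byte >> 3) & 0x7,
        "mode": first_byte & 0x7,
        "stratum": stratum,
    }


def is_usable_time_source(header: dict) -> bool:
    """Return True if a client could reasonably sync against this server.

    Assumption: reject on leap == 3 or stratum outside 1..15.
    """
    if header["leap"] == LEAP_UNSYNCHRONIZED:
        return False
    return 1 <= header["stratum"] <= MAX_USABLE_STRATUM


def build_reply(leap: int, stratum: int) -> bytes:
    """Hypothetical helper: fabricate a 48-byte server reply for illustration."""
    first_byte = (leap << 6) | (4 << 3) | 4  # version 4, mode 4 (server)
    return struct.pack("!BB", first_byte, stratum) + bytes(46)


if __name__ == "__main__":
    # A master that has just been resumed and has not yet synced to its pools.
    resumed = parse_ntp_header(build_reply(leap=3, stratum=16))
    # The same master once it has stabilized.
    stable = parse_ntp_header(build_reply(leap=0, stratum=3))
    print("resumed master usable:", is_usable_time_source(resumed))
    print("stable master usable:", is_usable_time_source(stable))
```

If the master answers with an out-of-range stratum right after resume, a check like this would tell cause #2 apart from cause #1, where no reply arrives at all.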

Thanks,
-Alex

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev