See my response inline.
On Tue, Oct 18, 2016 at 6:07 AM, Dmitry Tantsur <dtant...@redhat.com> wrote:
> On 10/17/2016 11:10 PM, Wesley Hayutin wrote:
>> The RDO CI team is considering adding retries to our calls to
>> again .
>> This is very handy for bare metal environments where retries may be
>> needed due
>> to random chaos in the environment itself.
>> We're trying to balance two things here:
>> 1. reduce the number of false negatives in CI
>> 2. try not to overstep what CI should do vs. what the product should do.
>> We would like to hear your comments if you think this is acceptable for
>> CI or if
>> this may be overstepping.
>> Thank you
>>  http://paste.openstack.org/show/586035/
> I probably lack some context on exactly what problems you face. I don't
> have any disagreement with retrying it, just want to make sure we're not
> missing actual bugs.
I agree, we have to be careful not to paper over bugs while we try to
overcome the typical environmental delays that come with booting and
rebooting $x number of random hardware nodes.
To make this a little clearer, what I'm trying to determine is where
progressive delays and retries should be injected into the workflow of
deploying an overcloud.
Should we add options in the product itself that allow for $x number of
retries w/ a configurable set of delays for introspection?  Is the
expectation that this works the first time, every time?
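
As a rough illustration of the kind of retry with progressive delays being
discussed, here is a minimal Python sketch. The function name
retry_with_delays, its parameters, and the delay values are hypothetical
and not part of any existing tool or CI script. Each failed attempt is
logged and the last error is re-raised, so a persistent failure (a
possible real bug) still shows up instead of being papered over.

```python
import time


def retry_with_delays(action, delays=(5, 15, 45), sleep=time.sleep):
    """Call action(); after a failure, wait delays[i] seconds and retry.

    Hypothetical helper: makes len(delays) + 1 attempts in total and
    re-raises the final exception so persistent failures stay visible.
    """
    attempts = len(delays) + 1
    for attempt in range(attempts):
        try:
            return action()
        except Exception as exc:
            if attempt == attempts - 1:
                # Out of retries: surface the failure to the caller.
                raise
            # Log every failed attempt so flaky runs can still be audited.
            print(f"attempt {attempt + 1} failed ({exc}); "
                  f"retrying in {delays[attempt]}s")
            sleep(delays[attempt])
```

The delays tuple stands in for the "configurable set of delays": the
number of retries is simply its length, and whether it lives in CI or in
the product is exactly the open question here.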
Are we overstepping what CI should do by implementing .
Additionally would it be appropriate to implement , while  is
developed for the next release and is it OK to use  with older releases?
Thanks for your time and responses.
OpenStack Development Mailing List (not for usage questions)