I could not reproduce the error on an EC2 g3x8 instance, which makes it hard to
debug. I also suspect it was due to a resource usage limit on the CI instance.

On Mon, Oct 1, 2018 at 10:40 PM Pedro Larroy <pedro.larroy.li...@gmail.com>
wrote:

> It doesn't look like flakiness to me at first sight. I think it might be
> related to resource usage / allocation / leak in the worst case.
>
> Could be that there was not enough GPU memory at the time of test
> execution. But I'm just speculating, hence my original question.
>
> Pedro.
>
> On Mon, Oct 1, 2018 at 8:16 PM Lin Yuan <apefor...@gmail.com> wrote:
>
> > Hi Pedro,
> >
> > I also got this failure in my PR
> >
> >
> > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11742/27/pipeline
> >
> > I was not able to identify its root cause from the changelist. Are you
> > suggesting there is some flakiness in the master branch too?
> >
> > Thanks,
> >
> > Lin
> >
> > On Mon, Oct 1, 2018 at 4:55 PM Pedro Larroy <pedro.larroy.li...@gmail.com>
> > wrote:
> >
> > > Hi
> > >
> > > I saw this failure on CI:
> > >
> > >
> > > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1697/pipeline
> > >
> > > Have you seen other cases where we fail to select the best CUDNN algorithm?
> > > Under which circumstances could this happen, and do you think it is a good
> > > idea to have one selected by default as a last resort?
> > >
> > >
> > > Pedro.
> > >
> >
>
