[OpenDaylight Infrastructure] [opendaylight.org #63123] [linuxfoundation.org #63123] Re: LISPflowmapping performance CSIT failing

2018-11-05 Thread Thanh Ha via RT
Hi Lori,

Sounds like a problem that might be difficult to sort out but I'll poke at
this today and see if I can find some clues.

Regards,
Thanh

On Mon, Nov 5, 2018 at 4:51 PM Lori Jakab wrote:

> [adding helpdesk, not sure who is monitoring infrastructure@]
>
> On Fri, Nov 2, 2018 at 2:02 PM Lori Jakab wrote:
> >
> > Hi,
> >
> > For a while the lispflowmapping performance tests on Jenkins have been
> > failing, first intermittently, but now the Neon and Oxygen tests fail
> > almost always:
> >
> >
> https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-neon/
> >
> https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-fluorine/
> >
> https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-oxygen/
> >
> > This is the error message that I found most likely to be useful:
> > "Caused: java.io.IOException: Backing channel
> > 'prd-centos7-robot-2c-8g-42785' is disconnected." See the bottom of
> > the full console log:
> >
> >
> https://jenkins.opendaylight.org/releng/view/lispflowmapping/job/lispflowmapping-csit-1node-performance-only-neon/90/console
> >
> > Performance jobs use two VMs for the tests, and it looks like during
> > the tests the connection from the main VM to the slave is broken. I
> > couldn't find any clues for the root of the problem in these logs.
> >
> > Any ideas on how to fix this? Until the problem is fixed, these tests
> > just waste infra resources, so the sensible thing to do would be to
> > disable them, which is not the outcome I would prefer. The only other
> > project that still seems to have performance tests is SXP; their tests
> > at least finish, though not without failures, so I don't know how much
> > they are affected by this issue. MDSAL used to have performance tests
> > too, but I can't find them anymore.
> >
> > -Lori
> ___
> infrastructure mailing list
> infrastructure@lists.opendaylight.org
> https://lists.opendaylight.org/mailman/listinfo/infrastructure
>

___
infrastructure mailing list
infrastructure@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/infrastructure
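[Editor's note: the "Backing channel ... is disconnected" failure above generally means the SSH transport between the Jenkins master and the agent VM dropped mid-build. One common mitigation, sketched here under the assumption that the agents are SSH-launched via OpenSSH and with a hypothetical host pattern, is to enable keepalives so transient network stalls don't tear down the channel:]

```
# Illustrative ~/.ssh/config fragment on the Jenkins master
# (standard OpenSSH options; the host pattern is hypothetical):
Host prd-centos7-*
    ServerAliveInterval 60    # probe the agent after 60s of silence
    ServerAliveCountMax 5     # tolerate up to 5 missed probes (~5 min)
    TCPKeepAlive yes
```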


[OpenDaylight Infrastructure] [opendaylight.org #67316] Jenkins broken?

2019-01-18 Thread Thanh Ha via RT
On Fri Jan 18 13:17:06 2019, n...@hq.sk wrote:
> Hello,
> 
> it seems our build infra is in bad shape. About an hour ago the build
> queue was at ~200 jobs, with no reaction to the additional load:
> 
> https://jenkins.opendaylight.org/releng/label/centos7-builder-8c-8g/load-statistics
> 
> While the load went down a bit, the jobs are effectively dead:
> 
> https://jenkins.opendaylight.org/releng/job/openflowplugin-validate-autorelease-oxygen/428/console
> is stuck downloading a .pom file (which is a couple of KB).
> 
> https://jenkins.opendaylight.org/releng/job/yangtools-maven-javadoc-verify-master/2832/console
> has been going nowhere for 96 minutes; it usually completes in 6 minutes.
> 
> Regards,
> Robert


CPU was unusually high and everything appeared to be stuck. We ended up kicking 
the Jenkins process, making sure all the VMs / SSH connections were cleared from 
Jenkins before allowing jobs to go through.

CPU appears normal now and the few jobs I checked appear to be progressing.

Regards,
Thanh
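[Editor's note: for triaging situations like this, Jenkins exposes node state through its REST API at /computer/api/json. A minimal sketch of filtering for dead agents follows; the field names match Jenkins' standard computer API, but the node names are made up and the payload is canned rather than fetched from a live server:]

```python
import json

# Sample payload shaped like Jenkins' /computer/api/json response
# (node names here are illustrative, not real ODL agents).
payload = json.loads("""
{
  "computer": [
    {"displayName": "master", "offline": false},
    {"displayName": "prd-centos7-builder-8c-8g-1234", "offline": true},
    {"displayName": "prd-centos7-robot-2c-8g-42785", "offline": false}
  ]
}
""")

def offline_nodes(data):
    """Return the names of agents Jenkins currently reports as offline."""
    return [c["displayName"] for c in data["computer"] if c["offline"]]

print(offline_nodes(payload))  # ['prd-centos7-builder-8c-8g-1234']
```

Against a real instance, the same payload could be fetched with an authenticated GET of https://jenkins.opendaylight.org/releng/computer/api/json before deciding which agents to disconnect.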



[OpenDaylight Infrastructure] [opendaylight.org #69767] ODL Jenkins/nexus outage

2019-03-13 Thread Thanh Ha via RT
On Wed Mar 13 08:05:35 2019, n...@hq.sk wrote:
> On 13/03/2019 13:01, Robert Varga wrote:
> > Hello,
> >
> > it seems our cloud is experiencing issues:
> >
> > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/bgpcep-maven-verify-sodium-mvn35-openjdk8/30/console.log.gz:
> >
> >> [WARNING] Could not transfer metadata
> >> org.opendaylight.controller:odl-mdsal-trace:1.10.0-SNAPSHOT/maven-metadata.xml
> >> from/to opendaylight-snapshot
> >> (https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/):
> >> Failed to transfer file:
> >> https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/controller/odl-mdsal-trace/1.10.0-SNAPSHOT/maven-metadata.xml.
> >> Return code is: 502, ReasonPhrase:Bad Gateway.
> >> [WARNING] Failure to transfer
> >> org.opendaylight.controller:odl-mdsal-trace:1.10.0-SNAPSHOT/maven-metadata.xml
> >> from https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/
> >> was cached in the local repository, resolution will not be reattempted
> >> until the update interval of opendaylight-snapshot has elapsed or
> >> updates are forced. Original error: Could not transfer metadata
> >> org.opendaylight.controller:odl-mdsal-trace:1.10.0-SNAPSHOT/maven-metadata.xml
> >> from/to opendaylight-snapshot
> >> (https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/):
> >> Failed to transfer file:
> >> https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/controller/odl-mdsal-trace/1.10.0-SNAPSHOT/maven-metadata.xml.
> >> Return code is: 502, ReasonPhrase:Bad Gateway.
> >
> > Furthermore:
> >
> > 1) we do not have 4c-4g slaves:
> >
> > https://jenkins.opendaylight.org/releng/label/centos7-builder-4c-4g/load-statistics
> >
> 
> Same goes for 4c-16g slaves:
> 
> https://jenkins.opendaylight.org/releng/label/centos7-autorelease-4c-16g/load-statistics
> 
> which means autorelease-sodium-openjdk11 #40 has been sitting there for
> more than 2 hours.
> 
> Regards,
> Robert

Hi Everyone,

We are actively looking into this issue. What we know now is that provisioning 
is timing out for all new minions. We've contacted our cloud provider and they 
are investigating on their end.

In the meantime I'm trying to clear out all the orphaned systems that are not 
attached to Jenkins. I've temporarily put Jenkins into shutdown mode so that we 
can clear out the systems.

Will update once we know more.

Regards,
Thanh
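[Editor's note on the cached-failure warning quoted above: once a snapshot metadata transfer fails, Maven caches the failure and will not retry until the repository's update interval elapses, so builds can keep failing even after the Nexus 502s stop. Running `mvn -U` forces an update check; alternatively the snapshot update policy can be loosened. A sketch of the latter, assuming a standard ~/.m2/settings.xml and an illustrative profile id:]

```xml
<!-- Illustrative fragment for ~/.m2/settings.xml: force snapshot
     metadata re-checks so a cached failure is retried immediately.
     The profile id is hypothetical; the elements are standard
     Maven settings schema. -->
<profiles>
  <profile>
    <id>odl-snapshots-always</id>
    <repositories>
      <repository>
        <id>opendaylight-snapshot</id>
        <url>https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/</url>
        <snapshots>
          <enabled>true</enabled>
          <updatePolicy>always</updatePolicy>
        </snapshots>
      </repository>
    </repositories>
  </profile>
</profiles>
```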




[OpenDaylight Infrastructure] [opendaylight.org #69767] ODL Jenkins/nexus outage

2019-03-13 Thread Thanh Ha via RT
On Wed Mar 13 09:52:59 2019, zxiiro wrote:
> On Wed Mar 13 08:05:35 2019, n...@hq.sk wrote:
> > On 13/03/2019 13:01, Robert Varga wrote:
> > > Hello,
> > >
> > > it seems our cloud is experiencing issues:
> > >
> > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/bgpcep-maven-verify-sodium-mvn35-openjdk8/30/console.log.gz:
> > >
> > >> [WARNING] Could not transfer metadata
> > >> org.opendaylight.controller:odl-mdsal-trace:1.10.0-SNAPSHOT/maven-metadata.xml
> > >> from/to opendaylight-snapshot
> > >> (https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/):
> > >> Failed to transfer file:
> > >> https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/controller/odl-mdsal-trace/1.10.0-SNAPSHOT/maven-metadata.xml.
> > >> Return code is: 502, ReasonPhrase:Bad Gateway.
> > >> [WARNING] Failure to transfer
> > >> org.opendaylight.controller:odl-mdsal-trace:1.10.0-SNAPSHOT/maven-metadata.xml
> > >> from https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/
> > >> was cached in the local repository, resolution will not be
> > >> reattempted until the update interval of opendaylight-snapshot has
> > >> elapsed or updates are forced. Original error: Could not transfer
> > >> metadata
> > >> org.opendaylight.controller:odl-mdsal-trace:1.10.0-SNAPSHOT/maven-metadata.xml
> > >> from/to opendaylight-snapshot
> > >> (https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/):
> > >> Failed to transfer file:
> > >> https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/controller/odl-mdsal-trace/1.10.0-SNAPSHOT/maven-metadata.xml.
> > >> Return code is: 502, ReasonPhrase:Bad Gateway.
> > >
> > > Furthermore:
> > >
> > > 1) we do not have 4c-4g slaves:
> > >
> > > https://jenkins.opendaylight.org/releng/label/centos7-builder-4c-4g/load-statistics
> > >
> >
> > Same goes for 4c-16g slaves:
> >
> > https://jenkins.opendaylight.org/releng/label/centos7-autorelease-4c-16g/load-statistics
> >
> > which means autorelease-sodium-openjdk11 #40 has been sitting there for
> > more than 2 hours.
> >
> > Regards,
> > Robert
> 
> Hi Everyone,
> 
> We are actively looking into this issue. What we know now is that
> provisioning is timing out for all new minions. We've contacted our
> cloud provider and they are investigating on their end.
> 
> In the meantime I'm trying to clear out all the orphaned systems that
> are not attached to Jenkins. I've temporarily put Jenkins into
> shutdown mode so that we can clear out the systems.
> 
> Will update once we know more.

Hi Everyone,

Our cloud provider's glance service got choked up and was having response 
issues. They deployed additional glance workers and things have cleared up. The 
queue is processing again.

Apologies for the inconvenience.

Regards,
Thanh



Re: [OpenDaylight Infrastructure] [opendaylight.org #67157] Re: javadoc job failure since openjdk11 is enabled on verify jobs

2019-02-12 Thread Thanh Ha via RT
On Tue, Feb 12, 2019 at 3:36 PM Lori Jakab  wrote:

> On Thu, Jan 17, 2019 at 2:20 PM Anil Belur via RT wrote:
> >
> > On Thu Jan 17 01:13:50 2019, askb wrote:
> > > On Wed Jan 16 13:31:24 2019, lorand.ja...@gmail.com wrote:
> > > > Adding helpdesk.
> > > >
> > > > I suppose SET_JDK_VERSION should not be an array.
> > > >
> > > > -Lori
> > >
> > > Greetings Lori:
> > >
> > > This approach of passing multiple java versions (openjdk8 and
> > > openjdk11) seems to work for the maven-verify jobs but not for the
> > > javadocs-verify jobs. This is because JJB's ${job-name} expansion
> > > consumes ${java-version} when that variable is part of the job name.
> > > When the job name does not include ${java-version}, the value is
> > > passed to the job as a list.
> > >
> > > I think we need to make the scripts a little more intelligent to
> > > handle these scenarios, or simply fix the affected job, which
> > > requires a change in global-jjb. I've noted this in Jira and will
> > > work on fixing it in global-jjb to make sure this works properly for
> > > other jobs too.
> > >
> > > https://jira.linuxfoundation.org/browse/RELENG-1648
> > >
> > > Thanks,
> > > Anil
> >
> > Hello Lori:
> >
> > The below change in global-jjb should fix the issue with javadocs jobs.
> >
> > https://gerrit.linuxfoundation.org/infra/#/c/14233/
>
> This change has been merged for quite a while now, but I don't know if
> it's active. Our javadoc job still fails with the same symptoms. See
> this for a recent example:
>
>
> https://jenkins.opendaylight.org/releng/job/lispflowmapping-maven-javadoc-verify-sodium/2/console
>
> 18:48:29 JAVA_HOME=/usr/lib/jvm/java-1.['8', '11'].0-openjdk
>
> Can you please take a look?


Hi Lori,

Sorry I did not know this was still an issue for you. This patch:

https://git.opendaylight.org/gerrit/80299

should resolve the issue for you, as it passes the java-version list only to
the job group that needs it, without also applying it to the javadoc jobs.

Hope this helps,
Thanh
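[Editor's note: the symptom Lori quoted (JAVA_HOME=/usr/lib/jvm/java-1.['8', '11'].0-openjdk) is the classic signature of a YAML list being interpolated into a string template instead of being expanded per-element. A minimal Python sketch of the two behaviors, using hypothetical template strings rather than the actual global-jjb code:]

```python
# When the template contains the variable, JJB-style expansion produces
# one job per value; when it does not, the whole list is substituted
# into other fields verbatim as its str() representation.
java_versions = ["8", "11"]

# Expansion case: the variable appears in the job name, one job per version.
jobs = ["maven-verify-{v}".format(v=v) for v in java_versions]

# Leakage case: the list itself is formatted into a single string.
java_home = "/usr/lib/jvm/java-1.{v}.0-openjdk".format(v=java_versions)

print(jobs)       # ['maven-verify-8', 'maven-verify-11']
print(java_home)  # /usr/lib/jvm/java-1.['8', '11'].0-openjdk
```

The fix referenced above amounts to ensuring only templates that expand per-version ever receive the list.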
