Docker Hub Rate Limit issue RESOLVED (was: [vpp-dev] Jenkins jobs UNSTABLE due to failure to upload logs to nexus.fd.io)

2020-11-19 Thread Dave Wallace

Folks,

The Docker Hub Rate Limit issue has been resolved (details below) and 
the FD.io CI jobs are operating normally.  Please let me know if you 
encounter any errant failure signatures.


Thanks to Vanessa & Trishan for their help resolving the outage.

Cheers,
-daw-

 %< 
Docker Hub Rate Limit Resolution:

It turns out I misunderstood their rate limiting scheme: the limit is
imposed on pull requests made anonymously or with an unauthorized
Docker ID, not on the repository accounts. Therefore we needed to
create an authenticated account, add it to the 'fdiotools' 'users'
team, and then configure Nomad to log in with that Docker ID for pull
requests from the 'fdiotools' repositories in order to avoid the rate
limit.
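
For anyone who wants to see which limit applies to a given Docker ID,
here is a minimal sketch in Python. It relies on the rate-limit headers
that Docker Hub documents for its 'ratelimitpreview/test' repository.
It is not what was deployed on jenkins.fd.io; the function names
(parse_ratelimit, fetch_limits) are illustrative, and the header format
is assumed to be 'COUNT;w=WINDOW_SECONDS' as described in Docker's
documentation.

```python
import base64
import json
import re
import urllib.request

# Endpoints taken from Docker's rate-limit documentation.
TOKEN_URL = ("https://auth.docker.io/token"
             "?service=registry.docker.io"
             "&scope=repository:ratelimitpreview/test:pull")
MANIFEST_URL = ("https://registry-1.docker.io/v2/"
                "ratelimitpreview/test/manifests/latest")


def parse_ratelimit(value):
    """Parse a header value such as '100;w=21600' into (count, window_seconds).

    The window is None when the header carries only a count.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:;\s*w=(\d+))?\s*", value)
    if match is None:
        raise ValueError(f"unexpected rate-limit header: {value!r}")
    count = int(match.group(1))
    window = int(match.group(2)) if match.group(2) else None
    return count, window


def fetch_limits(username=None, password=None):
    """Return the parsed rate-limit headers seen by this Docker ID.

    Without credentials the request is anonymous, so the result reflects
    the limit applied to the caller's IP address.
    """
    token_request = urllib.request.Request(TOKEN_URL)
    if username and password:
        credentials = f"{username}:{password}".encode()
        token_request.add_header(
            "Authorization",
            "Basic " + base64.b64encode(credentials).decode())
    with urllib.request.urlopen(token_request, timeout=10) as response:
        token = json.load(response)["token"]

    # A HEAD request reads the headers without fetching the manifest body.
    head_request = urllib.request.Request(MANIFEST_URL, method="HEAD")
    head_request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(head_request, timeout=10) as response:
        limits = {}
        for name in ("ratelimit-limit", "ratelimit-remaining"):
            value = response.headers.get(name)
            if value is not None:
                limits[name] = parse_ratelimit(value)
        return limits


if __name__ == "__main__":
    # Anonymous check; pass a Docker ID and password to compare.
    print(fetch_limits())
```

Running it once anonymously and once with the Docker ID's credentials
should show whether the authenticated account gets a different
allowance than the anonymous one.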


Vanessa & Trishan created an fd.io email account and a Docker account,
which were then added to all of the jenkins.fd.io Nomad Plugin
configuration templates for all FD.io projects.  Nomad is now
successfully issuing docker pull requests and spinning up CI job
executors at the request of jenkins.fd.io!


Life is good :)
 %< 

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18102): https://lists.fd.io/g/vpp-dev/message/18102
Mute This Topic: https://lists.fd.io/mt/78377537/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] Jenkins jobs UNSTABLE due to failure to upload logs to nexus.fd.io

2020-11-18 Thread Dave Wallace

Folks,

IT-21051 was resolved by Vanessa's ci-management patch [0].  [Nearly]
simultaneously, two patches [1] [2] from Andrew Y were deployed which
removed the artifact publishing from the VPP CI jobs.  These changes
were subsequently reverted [3].


Operation of VPP CI jobs has been restored and I have done a 'recheck'
on all Gerrit changes which previously failed due to the UNSTABLE job
completion status.


Unfortunately, there is a new issue caused by hitting the Docker Hub
pull limit [4], which is causing job allocations to fail and the
Jenkins build queue to back up.  I have opened a new LF Help Desk
Ticket [4], sent an email to the TSC, and will bring this up in
tomorrow's TSC meeting to get it resolved.


There also appears to be a similar issue with the
vpp-csit-verify-device-master-1n-skx job, whose runs are failing due
to the inability to start containers.


Thank you for your patience during this outage and thanks to Vanessa & 
the entire LF-IT team who worked on identifying the fix to the log 
upload issue.  Also a big thank you to Andrew Yourtchenko for his 
assistance in pushing ci-management patches and Vratko for ci-management 
patch reviews.


-daw-

[0] https://gerrit.fd.io/r/c/ci-management/+/29986
[1] https://gerrit.fd.io/r/c/ci-management/+/29985
[2] https://gerrit.fd.io/r/c/ci-management/+/29987
[3] https://gerrit.fd.io/r/c/ci-management/+/29988
[4] https://jira.linuxfoundation.org/plugins/servlet/theme/portal/2/IT-21063




[vpp-dev] Jenkins jobs UNSTABLE due to failure to upload logs to nexus.fd.io

2020-11-17 Thread Dave Wallace

Folks,

There is an issue with CI jobs being marked as UNSTABLE due to the
failure to upload log files to nexus.fd.io.  This is stalling the CI
job pipeline because the checkstyle job is not succeeding.


I have opened a case with LF-IT: 
https://jira.linuxfoundation.org/plugins/servlet/theme/portal/2/IT-21051


Thanks,
-daw-
