On Fri, Mar 29, 2019 at 11:34 AM Thanh Ha <[email protected]> wrote:
> On Fri, Mar 29, 2019 at 9:50 AM Robert Varga <[email protected]> wrote:
>
>> Hello,
>>
>> it seems verify-javadoc jobs are hitting some infra-related issue, and
>> are resulting in unstable builds, with jenkins voting CR -1:
>>
>> https://jenkins.opendaylight.org/releng/job/openflowplugin-maven-javadoc-verify-sodium-openjdk8/142/
>>
>> This seems to have started happening overnight ...
>
> I'm looking into this. I don't see anything obvious that was merged into
> releng/builder between yesterday and today so I'm surprised if lftools is
> causing breakage all of a sudden.
>
> Will update once I have more details.

I fixed the lftools error handling code with this patch
https://gerrit.linuxfoundation.org/infra/15123
and we now get more useful error messages:

    ERROR: Failed to upload to Nexus with status code: 504 Gateway timeout.

It looks like the Gateway timeout issue we saw last year is back:
https://jira.linuxfoundation.org/browse/RELENG-1215

The issue is caused by too many tiny files being placed on the file system
for Nexus logs. Past a certain point, Nexus is unable to respond quickly
enough to requests to push files. If I recall correctly, this only affects
the logs volume, since we separate logs from the rest of the artifact
storage. javadoc jobs produce a significant number of tiny files, so
eventually we hit the threshold where it becomes a problem.

Last time we resolved it by putting in place a job to clear out old metadata
files in the logs directory that we weren't aware of (this is in addition to
the pre-existing log cleanup job), but it seems that may have only bought us
a little time.

I think long term we should probably stop using Nexus as a log storage
mechanism and use something like S3 buckets or OpenStack Swift for things
like these.

Short term I think there are a few things we could do:

* Stop hosting javadoc html in the logs archive as part of the javadoc
  verify jobs. While this is a cool feature and handy for reviewing the
  javadocs generated by the job, I'm not sure how much it is actually being
  used by people, and if it isn't, this might be the simplest solution.
* Alternatively, host javadoc html as a single zip file in archives instead.
* Once one of the above solutions is in place, clear out some old javadoc
  job logs to get Nexus back into a responsive state.

Regards,
Thanh
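For the single-zip option, a minimal sketch of what the packaging step could look like, so that a javadoc job archives one file instead of thousands of tiny ones. This is only an illustration and not what lftools or the releng jobs do today; the function name `zip_javadoc` and the `apidocs` / `javadoc.zip` paths are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_javadoc(javadoc_dir: Path, zip_path: Path) -> int:
    """Pack every file under javadoc_dir into one zip; return the file count.

    Hypothetical helper for illustration only.
    """
    count = 0
    target = zip_path.resolve()
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(javadoc_dir.rglob("*")):
            # Skip directories, and skip the zip itself in case it was
            # created inside the directory being archived.
            if not path.is_file() or path.resolve() == target:
                continue
            # Store paths relative to the javadoc root so the archive
            # unpacks to index.html, org/..., and so on.
            zf.write(path, arcname=path.relative_to(javadoc_dir).as_posix())
            count += 1
    return count


if __name__ == "__main__":
    # Small self-contained demo using a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "apidocs"  # assumed javadoc output directory
        (src / "org" / "example").mkdir(parents=True)
        (src / "index.html").write_text("<html></html>")
        (src / "org" / "example" / "Foo.html").write_text("<html>Foo</html>")
        out = Path(tmp) / "javadoc.zip"
        print(f"archived {zip_javadoc(src, out)} files into {out.name}")
```

Reviewers would then download and unpack the one archive to browse the generated docs, which trades a bit of convenience for far fewer files on the logs volume.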
_______________________________________________
infrastructure mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/infrastructure
