magnayn commented on Bug JENKINS-21487

So, I'm re-opening this as I can re-create the "white screen of death" regularly on our Jenkins instance, on the latest LTS (1.565.1).

I think this is strongly influenced by slow(ish) disk (we are on ZFS on Linux), because CPU usage is often sitting at relatively low levels. We have a reasonable number of jobs, but nothing to write home about.

Running the Groovy script from above seems to show it spending extreme amounts of time in what looks like parsing XML files. Just to get the size of a run list, it goes through the following trace:

    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:234)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    - locked <0x00000000f9b461f0> (a java.io.BufferedInputStream)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at java.io.PushbackInputStream.read(PushbackInputStream.java:139)
    at com.thoughtworks.xstream.core.util.XmlHeaderAwareReader.getHeader(XmlHeaderAwareReader.java:79)
    at com.thoughtworks.xstream.core.util.XmlHeaderAwareReader.<init>(XmlHeaderAwareReader.java:61)
    at com.thoughtworks.xstream.io.xml.AbstractXppDriver.createReader(AbstractXppDriver.java:65)
    at hudson.XmlFile.unmarshal(XmlFile.java:163)
    at hudson.model.Run.reload(Run.java:321)
    at hudson.model.Run.<init>(Run.java:309)
    at hudson.model.AbstractBuild.<init>(AbstractBuild.java:173)
    at hudson.maven.AbstractMavenBuild.<init>(AbstractMavenBuild.java:54)
    at hudson.maven.MavenModuleSetBuild.<init>(MavenModuleSetBuild.java:146)
    at sun.reflect.GeneratedConstructorAccessor58.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at jenkins.model.lazy.LazyBuildMixIn.loadBuild(LazyBuildMixIn.java:153)
    at jenkins.model.lazy.LazyBuildMixIn$1.create(LazyBuildMixIn.java:134)
    at hudson.model.RunMap.retrieve(RunMap.java:218)
    at hudson.model.RunMap.retrieve(RunMap.java:56)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:687)
    - locked <0x0000000082951908> (a hudson.model.RunMap)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:670)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.getById(AbstractLazyLoadRunMap.java:542)
    at jenkins.model.lazy.BuildReferenceMapAdapter.unwrap(BuildReferenceMapAdapter.java:42)
    at jenkins.model.lazy.BuildReferenceMapAdapter.access$200(BuildReferenceMapAdapter.java:27)
    at jenkins.model.lazy.BuildReferenceMapAdapter$SetAdapter._unwrap(BuildReferenceMapAdapter.java:356)
    at jenkins.model.lazy.BuildReferenceMapAdapter$SetAdapter.access$400(BuildReferenceMapAdapter.java:248)
    at jenkins.model.lazy.BuildReferenceMapAdapter$SetAdapter$1.adapt(BuildReferenceMapAdapter.java:271)
    at jenkins.model.lazy.BuildReferenceMapAdapter$SetAdapter$1.adapt(BuildReferenceMapAdapter.java:269)
    at hudson.util.AdaptedIterator.next(AdaptedIterator.java:54)
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:648)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at java.util.Collections$UnmodifiableCollection$1.hasNext(Collections.java:1101)
    at java.util.AbstractMap$2$1.hasNext(AbstractMap.java:392)
    at hudson.util.RunList.size(RunList.java:108)

That seems really, really expensive to perform when all the caller cares about is the size of the runlist. Regardless of anything else, that looks like an easy optimisation.
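To illustrate the kind of optimisation I mean, here is a minimal sketch. It is not Jenkins code: `LazyBuildIndex`, `sizeByIteration` and `loadCount` are names I made up, and the loader stands in for whatever parses build.xml. The point is only that a size can be answered from the set of known build numbers without loading a single build, whereas counting by iteration loads every one. This assumes the list is unfiltered; a filtered run list might still need to inspect the builds.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.IntFunction;

// Hypothetical stand-in for a lazily loaded build map; NOT Jenkins' actual RunMap.
final class LazyBuildIndex<R> {
    private final SortedSet<Integer> knownNumbers;   // assumed to come from a cheap directory listing
    private final IntFunction<R> loader;             // assumed expensive: e.g. parses build.xml from disk
    private final Map<Integer, R> loaded = new TreeMap<>();
    private int loadCount = 0;

    LazyBuildIndex(SortedSet<Integer> knownNumbers, IntFunction<R> loader) {
        this.knownNumbers = new TreeSet<>(knownNumbers);
        this.loader = loader;
    }

    // Cheap path: counts keys only and never invokes the loader.
    int size() {
        return knownNumbers.size();
    }

    // Expensive path: loads the build on first access and keeps it afterwards.
    synchronized R get(int number) {
        if (!knownNumbers.contains(number)) {
            return null;
        }
        R run = loaded.get(number);
        if (run == null) {
            run = loader.apply(number);
            loadCount++;
            loaded.put(number, run);
        }
        return run;
    }

    // Counting by walking the entries, which forces every build to be loaded.
    int sizeByIteration() {
        int n = 0;
        for (int number : knownNumbers) {
            if (get(number) != null) {
                n++;
            }
        }
        return n;
    }

    // Number of times the loader has been invoked so far.
    synchronized int loadCount() {
        return loadCount;
    }
}
```

With three known builds, `size()` returns 3 with zero loader calls, while `sizeByIteration()` returns the same 3 only after three loads.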

Similarly, the home page lockups seem to be because it spends a long time figuring out "what was the last successful / last failed" build, which looks to be similar in nature. I wonder if that could be cached to speed it up without having to load the entire dataset.
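As a rough idea of what such a cache could look like, here is a sketch under my own assumptions. `Outcome`, `LastOutcomeCache` and its methods are hypothetical names, not anything in Jenkins. It records the highest build number seen per outcome when a build finishes, so a page that only needs "last successful" or "last failed" can read one integer instead of walking the builds.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical outcome type for this sketch only.
enum Outcome { SUCCESS, FAILURE }

// Hypothetical per-job cache of the most recent build number for each outcome.
final class LastOutcomeCache {
    private final Map<Outcome, Integer> last = new EnumMap<>(Outcome.class);

    // Called when a build finishes; keeps the highest build number per outcome.
    synchronized void onBuildCompleted(int number, Outcome outcome) {
        last.merge(outcome, number, Math::max);
    }

    // Cheap lookup; empty means "not cached", so a caller would have to fall back to a scan.
    synchronized OptionalInt lastNumber(Outcome outcome) {
        Integer n = last.get(outcome);
        return n == null ? OptionalInt.empty() : OptionalInt.of(n);
    }

    // If the cached build is deleted, drop the entry rather than guess the previous one.
    synchronized void onBuildDeleted(int number) {
        last.values().removeIf(n -> n == number);
    }
}
```

The awkward part is invalidation: after a deletion the sketch simply forgets the entry, and a real implementation would need to rescan or persist the value somewhere to survive a restart.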

It's possible that memory pressure is causing the system to discard the loaded builds, so that it has to reload them from disk. I'm going to try increasing -Xmx and maybe getting some dumps to figure out what's happening.
