Re: Failsafe: Killing self fork JVM. PING timeout elapsed.

Jason Young Wed, 13 Mar 2019 16:08:59 -0700

I upgraded failsafe and surefire to 3.0.0-M3 as advised; we encountered the
same exception. (Still using -Xmx5g, will switch to OpenJ9 soon in case
that helps.)


BTW I also asked on StackOverflow previously, for anyone interested:
https://stackoverflow.com/questions/54755846/killing-self-fork-jvm-ping-timeout-elapsed

On Tue, Feb 26, 2019 at 6:40 PM Jason Young <jason.yo...@procentive.com>
wrote:

> Thanks again for the information.
>
> We had increased the RAM to 3g some time ago to prevent OOMEs. More
> recently, I increased the RAM again to 5g for extra headroom since we had
> more headroom available; the problem hasn't happened since, but it hasn't
> been very long.
>
> We use a more customized image based on Alpine 3.8.2. The JDK and Maven
> are obtained via apk.
>
> I will try upgrading failsafe (and surefire while I'm at it) sooner, and
> probably do some experimentation with JVMs another time (not pressing for
> me ATM).
>
> On Tue, Feb 26, 2019 at 12:20 PM Tibor Digana <tibordig...@apache.org>
> wrote:
>
>> >> I'll try to enable some logging about GC pauses to see what's up
>>
>> Please do not keep such a setting after tuning the GC, because it may
>> sometimes break the interprocess communication between the Maven process
>> and the surefire process.
>> It is worth writing GC information to a file rather than to the console
>> logs. This can be configured, I guess.
>>
>> >> Do you think the value is simply too low?
>>
>> GCing many objects may take some time, and I remember we had a user who
>> had this problem a year or two ago.
>> As a fix, we check every third NOOP (which is 3 x 10 sec) instead of every
>> NOOP. So 30 seconds looked satisfactory.
>> I think you are using an old version, 2.20 or something like that. The
>> fixes for Docker have been made since then, so please use the latest
>> version, 3.0.0-M3.
>> See this page:
>> https://maven.apache.org/surefire/maven-surefire-plugin/docker.html
>> We used maven:3.5.3-jdk-8-alpine in this test. Which base image did you use?
>>
>> Cheers
>> Tibor
>>
>> On Tue, Feb 26, 2019 at 5:24 PM Jason Young <jason.yo...@procentive.com>
>> wrote:
>>
>> > Thanks for the information. It's good to see someone understands a
>> > little about this.
>> >
>> > Incidentally, we have been looking at other GCs and VMs for the
>> > application in production environments, so I'll look into how these
>> > affect tests as well. I'll try to enable some logging about GC pauses
>> > to see what's up.
>> >
>> > How would `-Xmx3g` cause long GC cycles? Do you think the value is
>> > simply too low?
>> >
>> > FWIW we're running the Maven build in an Alpine-based Docker container.
>> >
>> > On Sat, Feb 23, 2019 at 6:36 AM Tibor Digana <tibordig...@apache.org>
>> > wrote:
>> >
>> > > Hi Jason,
>> > >
>> > > We spoke about this issue on our chat in ASF Slack:
>> > > "I think his tests have been paused for long GC periods and timed
>> > > out after 3x the PING period = 30 seconds. After this period the
>> > > forked JVM assumed the Maven process had been killed by Jenkins CI,
>> > > and therefore all surefire processes are killed as well and all the
>> > > file handles and memory are freed."
>> > >
>> > > "But I have to say that `-Xmx3g` may cause long GC cycles, see
>> > > https://maven.apache.org/surefire/maven-surefire-plugin/examples/shutdown.html"
>> > >
>> > > You are using java-1.8-openjdk. I guess you should use Shenandoah GC,
>> > > which is an experimental algorithm in JVM 1.8. This would
>> > > significantly shorten the GC cycles.
>> > >
>> > > We should of course provide a new configuration parameter to give
>> > > you a chance to prolong the PING.
>> > >
>> > > Cheers
>> > > Tibor
>> > >
>> >
>> >
>> > --
>> >
>> > Jason Young
>> >
>>
>
>
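
The PING timeout described in the quoted messages (a NOOP every 10 seconds, checked every third NOOP, so a 30-second window) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Surefire's actual implementation: the function names, the `pauses` parameter, and the idea of modelling a GC stall as an interval in which no NOOP is sent are all hypothetical. Only the 10-second period and the factor of 3 come from the thread.

```python
PING_PERIOD_S = 10       # from the thread: one NOOP every 10 seconds
CHECK_EVERY_N_PINGS = 3  # from the thread: the check runs every third NOOP


def ping_times(duration_s, pauses=()):
    """Times (in seconds) at which NOOPs are sent; none are sent during a pause.

    `pauses` is a sequence of (start_s, end_s) stalls, a hypothetical
    stand-in for long GC pauses.
    """
    times = []
    t = PING_PERIOD_S
    while t <= duration_s:
        if not any(start <= t < end for start, end in pauses):
            times.append(t)
        t += PING_PERIOD_S
    return times


def first_timeout(pings, duration_s):
    """Return the time of the first check window that saw no NOOP, or None."""
    window = PING_PERIOD_S * CHECK_EVERY_N_PINGS
    check = window
    while check <= duration_s:
        # The window covers (check - window, check].
        if not any(check - window < p <= check for p in pings):
            return check
        check += window
    return None


# A 20-second stall still leaves one NOOP in every 30-second window.
short_stall = first_timeout(ping_times(120, pauses=[(35, 55)]), 120)
# A 60-second stall empties the window ending at t=60, so the fork would give up.
long_stall = first_timeout(ping_times(120, pauses=[(35, 95)]), 120)
```

In this model a pause only matters once it swallows an entire 30-second window, which matches the suggestion in the thread that only unusually long GC cycles would trigger the "Killing self fork JVM" message.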
