Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!
A test timed out. I've beasted with the same settings but can't
reproduce. Either JVM bug somewhere or cosmic interference...

Dawid

On Wed, Aug 24, 2022 at 3:32 AM Policeman Jenkins Server wrote:
>
> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-MacOSX/978/
> Java: 64bit/jdk-18 -XX:+UseCompressedOops -XX:+UseSerialGC
>
> 2 tests failed.
> FAILED: org.apache.lucene.analysis.ko.TestKoreanReadingFormFilter.testRandomData
>
> Error Message:
> java.lang.Exception: Test abandoned because suite timeout was reached.
>
> Stack Trace:
> java.lang.Exception: Test abandoned because suite timeout was reached.
>         at __randomizedtesting.SeedInfo.seed([9AA6F3EBA279C5BA]:0)
>
>
> FAILED: org.apache.lucene.analysis.ko.TestKoreanReadingFormFilter.classMethod
>
> Error Message:
> java.lang.Exception: Suite timeout exceeded (>= 720 msec).
>
> Stack Trace:
> java.lang.Exception: Suite timeout exceeded (>= 720 msec).
>         at __randomizedtesting.SeedInfo.seed([9AA6F3EBA279C5BA]:0)
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!
Hi Dawid, I looked at this and also
https://github.com/apache/lucene/issues/7687

If you look at the instances and how sporadic they are, the problem
could be caused by TimeoutSuite using wall-clock time in
com.carrotsearch.randomizedtesting? Especially in virtual machines,
wall-clock time can be extremely inaccurate when you spin them up,
then there's a big correction (via NTP or VM agent).

I have no proof this is what is happening, except to say, I think it
would be better if randomizedtesting used monotonic time (nanoTime)
rather than wall-clock time (currentTimeMillis). It would make it more
robust.

On Wed, Aug 24, 2022 at 4:48 AM Dawid Weiss wrote:
>
> A test timed out. I've beasted with the same settings but can't
> reproduce. Either JVM bug somewhere or cosmic interference...
>
> Dawid
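A minimal sketch of the nanoTime suggestion made in this message, for readers who want to see the difference concretely. This is not randomizedtesting's code: the class `MonotonicDeadline` and its method names are hypothetical, and the only real APIs used are `System.nanoTime()` and `java.util.concurrent.TimeUnit`. The point is that the deadline is computed from elapsed monotonic time, so a wall-clock correction (NTP or VM agent) cannot push it forward or backward.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of randomizedtesting. */
final class MonotonicDeadline {
    private final long startNanos;   // reading of System.nanoTime() at creation
    private final long timeoutNanos; // allowed duration in nanoseconds

    private MonotonicDeadline(long startNanos, long timeoutNanos) {
        this.startNanos = startNanos;
        this.timeoutNanos = timeoutNanos;
    }

    /** Starts a deadline that expires after the given duration of monotonic time. */
    static MonotonicDeadline after(long timeout, TimeUnit unit) {
        return new MonotonicDeadline(System.nanoTime(), unit.toNanos(timeout));
    }

    /** True once at least the configured duration has elapsed. */
    boolean isExpired() {
        return isExpiredAt(System.nanoTime());
    }

    /** Same check against an explicit nanoTime reading (handy for tests). */
    boolean isExpiredAt(long nowNanos) {
        // Compare differences, never absolute values: nanoTime() has an
        // arbitrary origin and may be negative.
        return nowNanos - startNanos >= timeoutNanos;
    }

    /** Remaining time in milliseconds, never negative. */
    long remainingMillis() {
        long left = timeoutNanos - (System.nanoTime() - startNanos);
        return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(left));
    }
}
```

A watchdog thread could poll `isExpired()` instead of comparing `System.currentTimeMillis()` against a stored start time; a stepped system clock then has no effect on when the timeout fires.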
Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!
Damn. I know about it but never had it happen to me. You're right in
that it could be a reason and it's definitely one of the aspects I can
take off the checklist. It looks strange because those timeouts are
fairly high - the time correction would indeed have to be significant
for this to fail (and in the middle of the process?!). Anyway, I'll
look into this - thanks for the pointer!

Dawid

On Wed, Aug 24, 2022 at 1:39 PM Robert Muir wrote:
>
> Hi Dawid, I looked at this and also
> https://github.com/apache/lucene/issues/7687
>
> If you look at the instances and how sporadic they are, the problem
> could be caused by TimeoutSuite using wall-clock time in
> com.carrotsearch.randomizedtesting? Especially in virtual machines,
> wall-clock time can be extremely inaccurate when you spin them up,
> then there's a big correction (via NTP or VM agent).
>
> I have no proof this is what is happening, except to say, I think it
> would be better if randomizedtesting used monotonic time (nanoTime)
> rather than wall-clock time (currentTimeMillis). It would make it more
> robust.
Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!
If we look at the 7687 issue, there are definitely some failures that
can be explained by unruly tests randomly behaving badly. But a few of
those (such as simple stemmer tests) look suspicious to me.

I've fought the issue with my own tests (non-Java) and it's amazing how
much stuff can break if it relies on wall-clock time and the clock gets
stepped. I'm talking about basic 20-year-old mature C code too :) It is
also surprising how large these clock corrections can be with virtual
machines.

To really confirm it, we'd need "system logs" as well to correlate the
NTP activity with the failure. With VirtualBox Jenkins builds, I do this
by enabling a serial console to file, and configuring syslog to log to
/dev/console. And this "system log file" is just another artifact that
Jenkins saves away for debugging. That's how I found the problem in my
own tests.

On Wed, Aug 24, 2022 at 9:08 AM Dawid Weiss wrote:
>
> Damn. I know about it but never had it happen to me. You're right in
> that it could be a reason and it's definitely one of the aspects I can
> take off the checklist. It looks strange because those timeouts are
> fairly high - the time correction would indeed have to be significant
> for this to fail (and in the middle of the process?!). Anyway, I'll
> look into this - thanks for the pointer!
>
> Dawid
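As a complement to the system-log approach described in this message (and not a description of it), a clock step can also be spotted from inside the JVM by sampling wall-clock and monotonic time together and comparing how far each advanced. The sketch below rests on assumptions: `ClockStepDetector`, its methods, and any threshold passed to `isStep` are hypothetical names and values chosen for illustration; only `System.currentTimeMillis()` and `System.nanoTime()` are real APIs.

```java
/** Hypothetical helper for illustration; reports wall-clock drift between samples. */
final class ClockStepDetector {
    private long lastWallMillis; // previous System.currentTimeMillis() reading
    private long lastMonoNanos;  // previous System.nanoTime() reading

    ClockStepDetector(long wallMillis, long monoNanos) {
        this.lastWallMillis = wallMillis;
        this.lastMonoNanos = monoNanos;
    }

    /** Creates a detector seeded with the current clock readings. */
    static ClockStepDetector startNow() {
        return new ClockStepDetector(System.currentTimeMillis(), System.nanoTime());
    }

    /**
     * Records a new pair of readings and returns how much further the wall
     * clock moved than the monotonic clock since the previous sample, in
     * milliseconds. Positive: wall clock jumped forward; negative: backward.
     */
    long sample(long wallMillis, long monoNanos) {
        long wallElapsed = wallMillis - lastWallMillis;
        long monoElapsed = (monoNanos - lastMonoNanos) / 1_000_000L;
        lastWallMillis = wallMillis;
        lastMonoNanos = monoNanos;
        return wallElapsed - monoElapsed;
    }

    /** Convenience overload that reads both clocks now. */
    long sampleNow() {
        return sample(System.currentTimeMillis(), System.nanoTime());
    }

    /** True if the drift exceeds the caller-chosen threshold in either direction. */
    static boolean isStep(long driftMillis, long thresholdMillis) {
        return Math.abs(driftMillis) > thresholdMillis;
    }
}
```

Logging the value returned by `sampleNow()` at suite start and end would give a timestamped record that could be lined up against NTP entries in the archived system log.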
Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!
Hi,

this is the macOS VirtualBox. This one often has time shifts caused by
VirtualBox, and the NTP daemon of OS X is bullshit (no chrony).

Actually, earlier versions of macOS had a bug in their OS libc that made
applications segfault on backwards jumps of wall time; it was fixed a
few years ago. Now it looks like sometimes only Gradle/Java hangs
because of this. macOS and backwards-jumping time do not fit well!
Maybe that is a reason why Apple does not like their OS virtualized :-)
Their bullshit kernel only works for 100% Intel CPUs with all hardware
behaving exactly in order with respect to time.

Uwe

On 24.08.2022 at 15:07, Dawid Weiss wrote:
> Damn. I know about it but never had it happen to me. You're right in
> that it could be a reason and it's definitely one of the aspects I can
> take off the checklist. It looks strange because those timeouts are
> fairly high - the time correction would indeed have to be significant
> for this to fail (and in the middle of the process?!). Anyway, I'll
> look into this - thanks for the pointer!
>
> Dawid

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de
Re: [ANNOUNCE] Issue migration Jira to GitHub starts on Monday, August 22
Issue migration has been completed (except for minor cleanups).
This is the Jira -> GitHub issue number mapping for possible future usage.
https://github.com/apache/lucene-jira-archive/blob/main/migration/mappings-data/issue-map.csv.20220823_final

GitHub issue is now fully available for all issues.
For issue label management (e.g. "fix-version"), please review this manual.
https://github.com/apache/lucene/blob/main/dev-docs/github-issues-howto.md

Tomoko

On Mon, Aug 22, 2022 at 19:46, Michael McCandless wrote:

> Wooot! Thank you so much Tomoko!!
>
> Mike
>
> On Mon, Aug 22, 2022 at 6:44 AM Tomoko Uchida <tomoko.uchida.1...@gmail.com> wrote:
>
>> Issue migration has been started. Jira is now read-only.
>>
>> GitHub issue is available for new issues.
>>
>> - You should open new issues on GitHub. E.g.
>>   https://github.com/apache/lucene/issues/1078
>> - Do not touch issues that are in the middle of migration, please. E.g.
>>   https://github.com/apache/lucene/issues/1072
>> - While you cannot break these issues, migration scripts can
>>   modify/overwrite your comments on the issues.
>> - Pull requests are not affected. You can open/update PRs as usual.
>>   Please let me know if you have any trouble with PRs.
>>
>> Tomoko
>>
>> On Thu, Aug 18, 2022 at 18:23, Tomoko Uchida wrote:
>>
>>> Hello all,
>>>
>>> The Lucene project decided to move our issue tracking system from Jira
>>> to GitHub and migrate all Jira issues to GitHub.
>>>
>>> We start issue migration on Monday, August 22 at 8:00 UTC.
>>> 1) We make Jira read-only before migration. You cannot update existing
>>> issues until the migration is completed.
>>> 2) You can use GitHub for opening NEW issues or pull requests during
>>> migration.
>>>
>>> Note that issues should be raised in Jira at this moment, although
>>> GitHub issue is already enabled in the Lucene repository.
>>> Please do not raise issues in GitHub until we let you know that GitHub
>>> issue is officially available. We immediately close any issues on GitHub
>>> until then.
>>>
>>> Here are the detailed plan/migration steps.
>>> https://github.com/apache/lucene-jira-archive/issues/7
>>>
>>> Tomoko
>
> --
> Mike McCandless
>
> http://blog.mikemccandless.com
Re: [ANNOUNCE] Issue migration Jira to GitHub starts on Monday, August 22
Thanks! It seems to be working nicely.

Question about the fix-version tagging: I wonder if, going forward, we
want to maintain that for new issues? I happened to notice there is also
this "milestone" feature in GitHub -- does that seem like a place to put
version information?

On Wed, Aug 24, 2022 at 3:20 PM Tomoko Uchida wrote:
>
> Issue migration has been completed (except for minor cleanups).
> This is the Jira -> GitHub issue number mapping for possible future usage.
> https://github.com/apache/lucene-jira-archive/blob/main/migration/mappings-data/issue-map.csv.20220823_final
>
> GitHub issue is now fully available for all issues.
> For issue label management (e.g. "fix-version"), please review this manual.
> https://github.com/apache/lucene/blob/main/dev-docs/github-issues-howto.md
>
> Tomoko
Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!
On Wed, Aug 24, 2022 at 11:40 AM Uwe Schindler wrote:
>
> Hi,
>
> this is the MacOS virtualbox. This one often has timeshifts caused by
> Virtualbox and the NTP daemon of OSX is bullshit (no chrony).
>
> Actually earlier versions of MacOS had a bug in their OS libc
> segfaulting the app to crash on backwards jumps of wall time, which was
> fixed a few years ago. Now it looks like sometimes only Gradle/Java
> hangs because of this. Macos and backwards-jumping time do not fit well!
> Maybe a reason why Apple does not like their OS virtualized :-) Their
> bullshit kernel only works for 100% INTEL CPUs with all hardware
> behaving exactly in order to time.
>

Honestly, some of it is VirtualBox, too. Once you eliminate or work
around wall-clock time and just deal with monotonic time, there can
still be annoying issues with just monotonic time. With a Linux guest,
you'll see strange stuff, such as the kernel's soft lockup detector
tripping a lot when this happens. There are corresponding errors printed
in the vbox logging too. I set VBOX_RELEASE_LOG_DEST to allow archiving
the VirtualBox VM log for Jenkins pickup along with other logs: it helps
with debugging shit like this.

For a Linux guest, I basically exhausted all possible kernel clock
sources, and found that the virtualized kvm-clock, which is what you get
by default, is the best by far. I'm guessing macOS may not support this,
which probably makes things worse there.

I found that in my environment, for Linux guests, the remaining timer
issues can be greatly improved with a 'vboxmanage setextradata
VBoxInternal/TM/TSCModeSwitchAllowed 0'. Don't ask me what it does :)