Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!

2022-08-24 Thread Dawid Weiss
A test timed out. I've beasted with the same settings but can't
reproduce. Either JVM bug somewhere or cosmic interference...

Dawid

On Wed, Aug 24, 2022 at 3:32 AM Policeman Jenkins Server
 wrote:
>
> Build: https://jenkins.thetaphi.de/job/Lucene-9.x-MacOSX/978/
> Java: 64bit/jdk-18 -XX:+UseCompressedOops -XX:+UseSerialGC
>
> 2 tests failed.
> FAILED:  
> org.apache.lucene.analysis.ko.TestKoreanReadingFormFilter.testRandomData
>
> Error Message:
> java.lang.Exception: Test abandoned because suite timeout was reached.
>
> Stack Trace:
> java.lang.Exception: Test abandoned because suite timeout was reached.
> at __randomizedtesting.SeedInfo.seed([9AA6F3EBA279C5BA]:0)
>
>
> FAILED:  org.apache.lucene.analysis.ko.TestKoreanReadingFormFilter.classMethod
>
> Error Message:
> java.lang.Exception: Suite timeout exceeded (>= 720 msec).
>
> Stack Trace:
> java.lang.Exception: Suite timeout exceeded (>= 720 msec).
> at __randomizedtesting.SeedInfo.seed([9AA6F3EBA279C5BA]:0)
>
> -
> To unsubscribe, e-mail: builds-unsubscr...@lucene.apache.org
> For additional commands, e-mail: builds-h...@lucene.apache.org

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!

2022-08-24 Thread Robert Muir
Hi Dawid, I looked at this and also https://github.com/apache/lucene/issues/7687

If you look at the instances and how sporadic they are, the problem
could be caused by TimeoutSuite using wall-clock time in
com.carrotsearch.randomizedtesting? Especially in virtual machines,
wall-clock time can be extremely inaccurate when you spin them up,
then there's a big correction (via NTP or VM agent).

I have no proof this is what is happening, except to say, I think it
would be better if randomizedtesting used monotonic time (nanoTime)
rather than wall-clock time (currentTimeMillis). It would make it more
robust.
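
To illustrate the difference, here is a minimal sketch of a deadline measured on the monotonic clock. It is not randomizedtesting's actual code: the class MonotonicDeadline, its constructor and startingNow are hypothetical names invented for this example, and the injectable clock is only there so the logic can be exercised without sleeping.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical helper (not part of randomizedtesting): a timeout tracked on a monotonic clock. */
final class MonotonicDeadline {
  private final LongSupplier nanoClock; // normally System::nanoTime
  private final long startNanos;
  private final long timeoutNanos;

  MonotonicDeadline(long timeout, TimeUnit unit, LongSupplier nanoClock) {
    this.nanoClock = nanoClock;
    this.timeoutNanos = unit.toNanos(timeout);
    this.startNanos = nanoClock.getAsLong();
  }

  /** Convenience factory using the JVM's monotonic clock. */
  static MonotonicDeadline startingNow(long timeout, TimeUnit unit) {
    return new MonotonicDeadline(timeout, unit, System::nanoTime);
  }

  boolean isExpired() {
    // Compare elapsed differences only: nanoTime values have an arbitrary
    // origin and may be negative, so absolute comparisons are meaningless.
    return nanoClock.getAsLong() - startNanos >= timeoutNanos;
  }
}
```

Because System.nanoTime() is unaffected by NTP or VM-agent steps of the wall clock, a deadline built this way should not fire early when the system time jumps forward, which is the failure mode suspected above.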





Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!

2022-08-24 Thread Dawid Weiss
Damn. I know about it but never had it happen to me. You're right in
that it could be a reason and it's definitely one of the aspects I can
take off the checklist. It looks strange because those timeouts are
fairly high - the time correction would indeed have to be significant
for this to fail (and in the middle of the process?!). Anyway, I'll
look into this - thanks for the pointer!

Dawid




Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!

2022-08-24 Thread Robert Muir
If we look at the 7687 issue, there's definitely some that can be
explained by unruly tests randomly behaving badly. But a few of those
(such as simple stemmer tests) look suspicious to me.
I've fought the issue with my own tests (non-Java) and it's amazing how
much stuff can break, if it relies on wall-clock time and the clock
gets stepped. I'm talking about basic 20-year old mature C code too :)
It is also surprising how large these clock corrections can be with
virtual machines.

To really confirm it, we'd need "system logs" as well to correlate the
NTP activity with the failure. With virtualbox jenkins builds, I do
this by enabling a serial console to file, and configure syslog to log
to /dev/console. And this "system log file" is just another artifact
that jenkins saves away for debugging. That's how I found the problem
in my own tests.
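
Short of system logs, a cheap in-process check can at least show whether the wall clock was stepped while a test was running. The sketch below is only an illustration under that assumption: ClockStepProbe and sampleSkewMillis are hypothetical names, not an existing API, and the clocks are injected so the arithmetic can be checked deterministically.

```java
import java.util.function.LongSupplier;

/** Hypothetical diagnostic: measures how far the wall clock moved relative to the monotonic clock. */
final class ClockStepProbe {
  private final LongSupplier wallMillis; // normally System::currentTimeMillis
  private final LongSupplier monoNanos;  // normally System::nanoTime
  private long lastWallMillis;
  private long lastMonoNanos;

  ClockStepProbe(LongSupplier wallMillis, LongSupplier monoNanos) {
    this.wallMillis = wallMillis;
    this.monoNanos = monoNanos;
    this.lastWallMillis = wallMillis.getAsLong();
    this.lastMonoNanos = monoNanos.getAsLong();
  }

  /**
   * Returns (wall-clock elapsed) minus (monotonic elapsed) since the previous
   * sample, in milliseconds. A large positive value suggests a forward step,
   * a large negative value a backward step.
   */
  long sampleSkewMillis() {
    long wall = wallMillis.getAsLong();
    long mono = monoNanos.getAsLong();
    long skew = (wall - lastWallMillis) - (mono - lastMonoNanos) / 1_000_000L;
    lastWallMillis = wall;
    lastMonoNanos = mono;
    return skew;
  }
}
```

In a real run one would construct it with System::currentTimeMillis and System::nanoTime, sample it periodically from a background thread, and log any skew beyond a few hundred milliseconds next to the test output.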




Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!

2022-08-24 Thread Uwe Schindler

Hi,

this is the macOS VirtualBox VM. This one often has time shifts caused by 
VirtualBox, and the NTP daemon of OSX is bullshit (no chrony).


Actually, earlier versions of macOS had a bug in the OS libc that made 
apps segfault on backwards jumps of wall time; it was fixed a few years 
ago. Now it looks like only Gradle/Java sometimes hangs because of this. 
macOS and backwards-jumping time do not fit well! Maybe that's a reason 
why Apple does not like their OS virtualized :-) Their bullshit kernel 
only works on 100% Intel CPUs with all hardware behaving exactly in 
order with respect to time.


Uwe


--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de





Re: [ANNOUNCE] Issue migration Jira to GitHub starts on Monday, August 22

2022-08-24 Thread Tomoko Uchida


Issue migration has been completed (except for minor cleanups).
This is the Jira -> GitHub issue number mapping for possible future usage.
https://github.com/apache/lucene-jira-archive/blob/main/migration/mappings-data/issue-map.csv.20220823_final

GitHub issue is now fully available for all issues.
For issue label management (e.g. "fix-version"), please review this manual.
https://github.com/apache/lucene/blob/main/dev-docs/github-issues-howto.md

Tomoko


On Mon, Aug 22, 2022 at 19:46, Michael McCandless wrote:

> Wooot!  Thank you so much Tomoko!!
>
> Mike
>
> On Mon, Aug 22, 2022 at 6:44 AM Tomoko Uchida <
> tomoko.uchida.1...@gmail.com> wrote:
>
>> 
>>
>> Issue migration has been started. Jira is now read-only.
>>
>> GitHub issue is available for new issues.
>>
>> - You should open new issues on GitHub. E.g.
>> https://github.com/apache/lucene/issues/1078
>> - Do not touch issues that are in the middle of migration, please. E.g.
>> https://github.com/apache/lucene/issues/1072
>>   - While you cannot break these issues, migration scripts can
>> modify/overwrite your comments on the issues.
>> - Pull requests are not affected. You can open/update PRs as usual.
>> Please let me know if you have any trouble with PRs.
>>
>>
>> Tomoko
>>
>>
>> On Thu, Aug 18, 2022 at 18:23, Tomoko Uchida wrote:
>>
>>> Hello all,
>>>
>>> The Lucene project decided to move our issue tracking system from Jira
>>> to GitHub and migrate all Jira issues to GitHub.
>>>
>>> We start issue migration on Monday, August 22 at 8:00 UTC.
>>> 1) We make Jira read-only before migration. You cannot update existing
>>> issues until the migration is completed.
>>> 2) You can use GitHub for opening NEW issues or pull requests during
>>> migration.
>>>
>>> Note that issues should be raised in Jira at this moment, although
>>> GitHub issue is already enabled in the Lucene repository.
>>> Please do not raise issues in GitHub until we let you know that GitHub
>>> issue is officially available. We immediately close any issues on GitHub
>>> until then.
>>>
>>> Here are the detailed plan/migration steps.
>>> https://github.com/apache/lucene-jira-archive/issues/7
>>>
>>> Tomoko
>>>
>> --
> Mike McCandless
>
> http://blog.mikemccandless.com
>


Re: [ANNOUNCE] Issue migration Jira to GitHub starts on Monday, August 22

2022-08-24 Thread Michael Sokolov
Thanks! It seems to be working nicely.

Question about the fix-version: tagging. I wonder if going forward we
want to maintain that for new issues? I happened to notice there is also
this "milestone" feature in github -- does that seem like a place to
put version information?




Re: [JENKINS] Lucene-9.x-MacOSX (64bit/jdk-18) - Build # 978 - Unstable!

2022-08-24 Thread Robert Muir
On Wed, Aug 24, 2022 at 11:40 AM Uwe Schindler  wrote:
>
> Hi,
>
> this is the macOS VirtualBox VM. This one often has time shifts caused by
> VirtualBox, and the NTP daemon of OSX is bullshit (no chrony).
>
> Actually, earlier versions of macOS had a bug in the OS libc that made
> apps segfault on backwards jumps of wall time; it was fixed a few years
> ago. Now it looks like only Gradle/Java sometimes hangs because of this.
> macOS and backwards-jumping time do not fit well! Maybe that's a reason
> why Apple does not like their OS virtualized :-) Their bullshit kernel
> only works on 100% Intel CPUs with all hardware behaving exactly in
> order with respect to time.
>

Honestly, some of it is VirtualBox, too. Once you eliminate or
work around wall-clock time and just deal with monotonic time, there
can still be annoying issues with just monotonic time. With a Linux
guest, you'll see strange stuff, such as the kernel's softlockup detector
tripping a lot when this happens. There are corresponding errors printed
in the vbox logging too. I set VBOX_RELEASE_LOG_DEST to allow
archiving the virtualbox VM log for jenkins pickup along with other
logs: it helps with debugging shit like this. For linux guest, I
basically exhausted all possible kernel clock sources, and found the
kvm-clock virtualized one that happens by default is the best by far.
I'm guessing MacOS may not support this, which probably makes things
worse there. I found in my environment for linux guests, remaining
timer issues can be greatly improved with a 'vboxmanage setextradata
 VBoxInternal/TM/TSCModeSwitchAllowed 0'. Don't ask me what it
does :)
