paulirwin commented on issue #846: URL: https://github.com/apache/lucenenet/issues/846#issuecomment-2566110261
I am happy to report that I have definitively found the source of the failures from this original issue. It occurs only for the zh-Hant-TW culture, only for time zones with a negative offset, and only on .NET 6-8 on Linux and macOS.

The root cause is that the CultureInfo for that culture is missing the `tt` AM/PM designator from the format string in `DateTimeFormat.LongTimePattern` (as well as in `FullDateTimePattern`, but we aren't using that). So the following code:

```c#
using System.Globalization;

var ci = CultureInfo.GetCultureInfo("zh-Hant-TW");

// Print the combined pattern, then a date formatted with that pattern.
Console.WriteLine(ci.DateTimeFormat.LongDatePattern + " " + ci.DateTimeFormat.LongTimePattern);
Console.WriteLine(new DateTime(1969, 12, 31, 20, 0, 0).ToString(ci.DateTimeFormat.LongDatePattern + " " + ci.DateTimeFormat.LongTimePattern, ci));
```

... results in the following on .NET 6-8 on Linux and macOS:

```
yyyy年M月d日 dddd h:mm:ss
1969年12月31日 星期三 8:00:00
```

... whereas it results in the following on .NET 9, because the pattern includes the `tt` designator before the hour:

```
yyyy年M月d日 dddd tth:mm:ss
1969年12月31日 星期三 下午8:00:00
```

Without the 下午, this gets parsed as 8:00 am on the day before the Unix epoch (in this case, in a time zone with a -4:00 offset), rather than 8:00 pm, which when adjusted for the time zone offset is exactly the Unix epoch. Because the parsed time is 12 hours before the epoch, the documents do not match the date queries, the expected number of results is not returned, and the test assertion fails.

This is only a problem with negative offsets. With zero or positive offsets, the local time is at or after midnight, so it is correctly parsed as AM even without the designator.

I wrote a small program to check all cultures for the same problem, and it seems to affect only zh-Hant-TW, and only net6.0-net8.0. The .NET team seems to have fixed this in .NET 9 (possibly unintentionally, by upgrading ICU).
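The AM/PM ambiguity described above can be reproduced without depending on any particular ICU version. The following is a minimal sketch, not code from Lucene.NET: `RoundTrips` is a hypothetical helper, and the two format strings are invariant-culture stand-ins for the zh-Hant-TW patterns with and without `tt`.

```c#
using System;
using System.Globalization;

// Hypothetical helper (not part of Lucene.NET): returns true when `value`
// survives a ToString/TryParseExact round trip with the given format.
static bool RoundTrips(DateTime value, string format, CultureInfo culture)
{
    string text = value.ToString(format, culture);
    return DateTime.TryParseExact(text, format, culture, DateTimeStyles.None, out DateTime parsed)
        && parsed == value;
}

var invariant = CultureInfo.InvariantCulture;

// The Unix epoch as local time at a -4:00 offset: 8:00 pm on 1969-12-31.
var epochAtMinusFour = new DateTime(1969, 12, 31, 20, 0, 0);

// The Unix epoch as local time at a zero offset: midnight on 1970-01-01.
var epochAtZero = new DateTime(1970, 1, 1, 0, 0, 0);

// Assumed stand-ins for the pattern shapes seen on .NET 6-8 and on .NET 9.
string withoutDesignator = "yyyy-MM-dd h:mm:ss";
string withDesignator = "yyyy-MM-dd tt h:mm:ss";

Console.WriteLine(RoundTrips(epochAtMinusFour, withoutDesignator, invariant)); // False
Console.WriteLine(RoundTrips(epochAtMinusFour, withDesignator, invariant));    // True
Console.WriteLine(RoundTrips(epochAtZero, withoutDesignator, invariant));      // True
```

The first call fails because "8:00:00" with a 12-hour `h` specifier and no designator is read back as 8:00 am. The last call succeeds because midnight is formatted as "12:00:00" and read back as 12:00 am, which matches the observation that zero and positive offsets are unaffected.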
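For readers who want to repeat a scan over all cultures on their own runtime, here is one possible approach. It is only a sketch under stated assumptions and is not the program mentioned above: `IsAmbiguousTwelveHourPattern` is a hypothetical heuristic that flags a pattern using the 12-hour `h` specifier with no `t` designator, and which cultures get enumerated depends on the platform's globalization data.

```c#
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical heuristic: flags a time pattern that uses the 12-hour 'h'
// specifier without any 't' (AM/PM) designator. Quoted literals and
// backslash-escaped characters are stripped first, so a literal 'h' inside
// a 24-hour pattern is not miscounted as a specifier.
static bool IsAmbiguousTwelveHourPattern(string pattern)
{
    string specifiers = Regex.Replace(pattern, @"'[^']*'|""[^""]*""|\\.", string.Empty);
    return specifiers.Contains('h') && !specifiers.Contains('t');
}

// Collect every enumerated specific culture whose long time pattern is flagged.
var suspects = CultureInfo.GetCultures(CultureTypes.SpecificCultures)
    .Where(c => IsAmbiguousTwelveHourPattern(c.DateTimeFormat.LongTimePattern))
    .Select(c => $"{c.Name}: {c.DateTimeFormat.LongTimePattern}")
    .ToList();

foreach (string line in suspects)
{
    Console.WriteLine(line);
}
```

A pattern check like this is cheaper than formatting and parsing a date for every culture, but it only inspects `LongTimePattern`. A full ToString/Parse round trip is the stricter test.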
I am going to fix this by adding another form of "sanity" check for the culture/time zone combinations, one that ensures that the Unix epoch can round-trip through ToString/Parse with the given format string. If the check fails, it'll iterate again and find a new random culture/time zone that works.

Additionally, through many repeated random runs, I found another failure that had not been reported yet. For cultures that use a decimal comma, such as sv-FI, small decimal values can fail due to a J2N round-trip formatting/parsing bug when there is a decimal comma and exponential notation. That has been filed as https://github.com/NightOwl888/J2N/issues/128.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@lucenenet.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org