Github user attilapiros commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23000#discussion_r234992152
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala ---
    @@ -410,6 +410,30 @@ class DateTimeUtilsSuite extends SparkFunSuite {
         assert(getDayInYear(getInUTCDays(c.getTimeInMillis)) === 78)
       }
     
    +  test("SPARK-26002: correct day of year calculations for Julian calendar years") {
    +    TimeZone.setDefault(TimeZoneUTC)
    +    val c = Calendar.getInstance(TimeZoneUTC)
    +    c.set(Calendar.MILLISECOND, 0)
    +    (1000 to 1600 by 100).foreach { year =>
    +      // January 1 is the 1st day of year.
    +      c.set(year, 0, 1, 0, 0, 0)
    +      assert(getYear(getInUTCDays(c.getTimeInMillis)) === year)
    +      assert(getMonth(getInUTCDays(c.getTimeInMillis)) === 1)
    +      assert(getDayInYear(getInUTCDays(c.getTimeInMillis)) === 1)
    +
    +      // March 1 is the 61st day of the year as they are leap years. It is true for
    +      // even the multiples of 100 as before 1582-10-4 the Julian calendar leap year calculation
    +      // is used in which every multiples of 4 are leap years
    +      c.set(year, 2, 1, 0, 0, 0)
    +      assert(getDayInYear(getInUTCDays(c.getTimeInMillis)) === 61)
    +      assert(getMonth(getInUTCDays(c.getTimeInMillis)) === 3)
    +
    +      // For non-leap years:
    +      c.set(year + 1, 2, 1, 0, 0, 0)
    +      assert(getDayInYear(getInUTCDays(c.getTimeInMillis)) === 60)
    +    }
    --- End diff --
    
    The last two (1600-01-01 and 1600-03-01) are already tested, as 1600 is included in `(1000 to 1600 by 100)`.
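
A quick way to confirm this is to materialize the range. The snippet below is only a sketch; the object name `CenturyYears` is made up for illustration and is not part of the PR.

```scala
object CenturyYears {
  // `to` builds an inclusive range, so the upper bound 1600 is part of it.
  val years: List[Int] = (1000 to 1600 by 100).toList

  // Every value is a multiple of 4, hence a leap year under the Julian rule.
  val allJulianLeapYears: Boolean = years.forall(_ % 4 == 0)

  def main(args: Array[String]): Unit = {
    println(years)
    println(allJulianLeapYears)
  }
}
```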
    
    I have added a new check for 1582-10-03. 
    
    But I would not add an assert for 1582-10-14 without knowing what the correct value really is.
    
    I have checked PostgreSQL, but it does not handle this 10-day gap at all: from `SELECT EXTRACT(DOY FROM TIMESTAMP '1582-10-03 00:00:00');` to `SELECT EXTRACT(DOY FROM TIMESTAMP '1582-10-16 00:00:00');` the day of year increases by one each day, running consecutively from 276 to 289.
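
To make the difference between the two conventions concrete, here is a minimal sketch that compares the day of year under the hybrid Julian/Gregorian calendar of `java.util.GregorianCalendar` (the calendar the test above builds its dates with) against the proleptic Gregorian calendar of `java.time.LocalDate`, which matches what PostgreSQL reports. The object and method names are hypothetical and do not come from Spark.

```scala
import java.time.LocalDate
import java.util.{Calendar, GregorianCalendar, TimeZone}

object DayOfYearComparison {
  private val utc = TimeZone.getTimeZone("UTC")

  // Hybrid calendar: Julian rules before the default cutover (1582-10-15),
  // Gregorian rules from the cutover onwards.
  def hybridDayOfYear(year: Int, month: Int, day: Int): Int = {
    val c = new GregorianCalendar(utc)
    c.clear()
    c.set(year, month - 1, day) // Calendar months are zero-based
    c.get(Calendar.DAY_OF_YEAR)
  }

  // Proleptic Gregorian calendar: Gregorian leap year rules applied to all
  // dates, with no gap in October 1582.
  def prolepticDayOfYear(year: Int, month: Int, day: Int): Int =
    LocalDate.of(year, month, day).getDayOfYear

  def main(args: Array[String]): Unit = {
    // 1500 is a leap year under the Julian rule but not under the Gregorian
    // one, so March 1 falls on a different day of year in each calendar.
    println(hybridDayOfYear(1500, 3, 1))
    println(prolepticDayOfYear(1500, 3, 1))
    // The proleptic calendar counts straight through October 1582.
    println(prolepticDayOfYear(1582, 10, 3))
    println(prolepticDayOfYear(1582, 10, 16))
  }
}
```

The two calendars already disagree for 1500-03-01, so the expected value of any assert near the cutover depends on which convention Spark is meant to follow.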
      

