srowen commented on a change in pull request #25998: [SPARK-29328][SQL] Fix calculation of mean seconds per month
URL: https://github.com/apache/spark/pull/25998#discussion_r331546715
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/EventTimeWatermark.scala
##########

@@ -28,9 +27,7 @@ object EventTimeWatermark {
   val delayKey = "spark.watermarkDelayMs"

   def getDelayMs(delay: CalendarInterval): Long = {
-    // We define month as `31 days` to simplify calculation.
-    val millisPerMonth = TimeUnit.MICROSECONDS.toMillis(CalendarInterval.MICROS_PER_DAY) * 31
-    delay.milliseconds + delay.months * millisPerMonth
+    delay.milliseconds + delay.months * MILLIS_PER_MONTH
Review comment:
`months_between` is sort of a special case, because "31 days per month" is (it seems) actually how it is supposed to work. It's rare that someone would specify "1 month" here, let alone "10 years", right? Or am I missing something? These are things like watermark intervals. Not that the semantics don't matter; it's just quite a corner case, so I don't feel strongly either way. We don't need to match the `months_between` semantics. More precision is nice, but surely it almost never comes up anyway? As a result, I don't mind the change.
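For context, the difference between the two conventions can be sketched in a small standalone snippet. The object and value names below are hypothetical (not Spark's actual identifiers), and the "mean month" figure assumes the Gregorian average year of 365.2425 days divided by 12:

```scala
// Hypothetical standalone sketch; MonthMillis and its members are not Spark names.
object MonthMillis {
  val MILLIS_PER_DAY: Long = 24L * 60 * 60 * 1000

  // Old convention in getDelayMs: a month is defined as exactly 31 days.
  val millisPer31DayMonth: Long = MILLIS_PER_DAY * 31 // 2,678,400,000 ms

  // Mean-month convention: 365.2425 / 12 = 30.436875 days per month.
  val millisPerMeanMonth: Long = (MILLIS_PER_DAY * 365.2425 / 12).toLong // 2,629,746,000 ms

  // Shape of the delay calculation under discussion: fixed millis plus
  // a months component converted with the chosen per-month constant.
  def getDelayMs(delayMillis: Long, delayMonths: Int): Long =
    delayMillis + delayMonths * millisPerMeanMonth
}
```

The two constants differ by about 48.7 million ms (roughly 13.5 hours) per month, which is why the choice only matters when someone actually specifies a month-based watermark interval.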