GitHub user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/16449
  
    It sounds like our watermarkTime delay calculation causes this issue. Below 
are two typical cases:
    
    Case 1: when setting the watermark delay to a 1-month interval:
    ```Scala
          .withWatermark("eventTime", "1 months")
    ```
    the `watermark` time is `Thu Dec 01 20:05:34 PST 2016` and the current 
time is `Sun Jan 01 20:05:34 PST 2017`
    
    Case 2: when setting the watermark delay to a 29-month interval:
    ```Scala
          .withWatermark("eventTime", "29 months")
    ```
    the `watermark` time is `Thu Jul 17 21:10:50 PDT 2014` and the current 
time is `Sun Jan 01 20:10:50 PST 2017`
    
    It sounds like it is caused by our intentional over-estimation (that is, by 
[using 31 days per 
month](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/EventTimeWatermarkExec.scala#L88))?
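
To check the arithmetic behind the two cases, here is a minimal sketch that subtracts a fixed-length delay of 31 days per month from the reported current times. It is not the Spark implementation: the object `WatermarkDelaySketch`, its members, and the `America/Los_Angeles` zone are assumptions made for illustration, and it relies only on `java.time`.

```scala
import java.time.{Duration, ZoneId, ZonedDateTime}

object WatermarkDelaySketch {
  // Assumption: every month in the delay counts as 31 days, as described above.
  val DaysPerMonth: Long = 31L

  // Assumption: the PST/PDT timestamps in the two cases are in this zone.
  val zone: ZoneId = ZoneId.of("America/Los_Angeles")

  // Current times reported in Case 1 and Case 2.
  val case1Now: ZonedDateTime = ZonedDateTime.of(2017, 1, 1, 20, 5, 34, 0, zone)
  val case2Now: ZonedDateTime = ZonedDateTime.of(2017, 1, 1, 20, 10, 50, 0, zone)

  // Fixed-length delay: months become a constant number of 24-hour days.
  def delay(months: Long): Duration = Duration.ofDays(months * DaysPerMonth)

  // Watermark = current instant minus the fixed-length delay.
  def watermark(now: ZonedDateTime, months: Long): ZonedDateTime =
    ZonedDateTime.ofInstant(now.toInstant.minus(delay(months)), now.getZone)

  def main(args: Array[String]): Unit = {
    println(watermark(case1Now, 1))    // Case 1: 31 days back
    println(watermark(case2Now, 29))   // Case 2: 29 * 31 = 899 days back
    println(case2Now.minusMonths(29))  // calendar-aware subtraction, for comparison
  }
}
```

Under these assumptions the sketch reproduces both reported watermark times. For Case 2, a calendar-aware subtraction of 29 months would instead land on Aug 01 2014 at 20:10:50 PDT, about 15 days later than the reported watermark, and the one-hour shift in the reported time comes from crossing a daylight-saving boundary with a fixed-length delay.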

