Burak Yavuz commented on SPARK-21590:

Ah, I apologize, I thought the other way around (you had data in UTC but wanted 
to report in CST). Would you like to submit a PR for this then?

I think for the calculations to still hold these are the requirements:
 1. The absolute value of the start offset is less than the slide interval
 2. If a start offset is negative, we add the slide interval to make it positive
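
As a rough illustration of the two requirements above, here is a minimal sketch in plain Python. It is not Spark's code: the helper name normalize_start_offset and the use of timedelta are assumptions made for this example only.

```python
from datetime import timedelta


def normalize_start_offset(start: timedelta, slide: timedelta) -> timedelta:
    """Hypothetical helper: map a possibly negative start offset to an
    equivalent non-negative one, following the two requirements above."""
    if slide <= timedelta(0):
        raise ValueError("slide interval must be positive")
    # Requirement 1: |start offset| must be less than the slide interval.
    if abs(start) >= slide:
        raise ValueError("absolute start offset must be less than the slide interval")
    # Requirement 2: a negative offset is shifted by one slide interval.
    if start < timedelta(0):
        return start + slide
    return start


# The case from the issue: a 1-day window shifted by -8 hours.
shifted = normalize_start_offset(timedelta(hours=-8), timedelta(days=1))
print(shifted)  # 16:00:00
```

Under these assumptions, a start offset of -8 hours with a 1-day slide becomes +16 hours, which the existing non-negative check would already accept.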

In the end this'll open up the APIs to accept negative values. The calculation 
itself shouldn't change (it's pretty brittle; I had to think about it for a week 
to conclude that there is no magic formula that gives the exact windows).
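
To see why shifting a negative offset by one slide interval should leave the windows unchanged, here is a small sketch for the simple tumbling case only (window duration equal to slide interval). The function tumbling_window_start and its floor-modulo formula are assumptions for illustration, not the calculation Spark actually performs.

```python
HOUR_US = 3_600_000_000  # microseconds per hour
DAY_US = 24 * HOUR_US    # microseconds per day


def tumbling_window_start(ts_us: int, slide_us: int, start_us: int) -> int:
    """Hypothetical helper: start of the tumbling window containing ts_us,
    with windows aligned to start_us. Python's % returns a non-negative
    result for a positive divisor, so negative offsets are handled too."""
    return ts_us - (ts_us - start_us) % slide_us


# Compare a -8 hour offset with its shifted equivalent of +16 hours
# over three days of hourly timestamps.
for ts in range(0, 3 * DAY_US, HOUR_US):
    negative = tumbling_window_start(ts, DAY_US, -8 * HOUR_US)
    shifted = tumbling_window_start(ts, DAY_US, 16 * HOUR_US)
    assert negative == shifted
```

Both offsets differ by exactly one slide interval, so they describe the same set of window boundaries: 16:00:00 UTC each day, which is 00:00:00 at UTC + 8.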

> Structured Streaming window start time should support negative values to 
> adjust time zone
> -----------------------------------------------------------------------------------------
>                 Key: SPARK-21590
>                 URL: https://issues.apache.org/jira/browse/SPARK-21590
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.0.0, 2.0.1, 2.1.0, 2.2.0
>         Environment: spark 2.2.0
>            Reporter: Kevin Zhang
>              Labels: spark-sql, spark2.2, streaming, structured, timezone, 
> window
> I want to calculate (unique) daily access count using structured streaming 
> (2.2.0). 
> Currently, a Structured Streaming window with a 1-day duration starts at 
> 00:00:00 UTC and ends at 23:59:59 UTC each day, but my local timezone is CST 
> (UTC + 8 hours) and I want date boundaries to fall at 00:00:00 CST (that is, 
> 00:00:00 UTC minus 8 hours). 
> In Flink I can set the window offset to -8 hours to achieve this, but in 
> Structured Streaming, if I set the start time (the equivalent of the offset in 
> Flink) to -8 hours or any other negative value, I get the following error:
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot 
> resolve 'timewindow(timestamp, 86400000000, 86400000000, -28800000000)' due 
> to data type mismatch: The start time (-28800000000) must be greater than or 
> equal to 0.;;
> {code}
> This happens because the time window validates its input parameters and 
> requires each value to be greater than or equal to 0.
> So I'm wondering whether we can remove the restriction that the start time 
> cannot be negative.
