Maciej Szymkiewicz created SPARK-18690:
------------------------------------------

             Summary: Backward compatibility of unbounded frames
                 Key: SPARK-18690
                 URL: https://issues.apache.org/jira/browse/SPARK-18690
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 2.1.0
            Reporter: Maciej Szymkiewicz
            Priority: Minor


SPARK-17845 introduced constant values to mark unbounded frames. This can break 
backward compatibility on some systems:

In Spark <= 2.0:
-  {{UNBOUNDED PRECEDING}} is {{-sys.maxsize}}
-  {{UNBOUNDED FOLLOWING}} is {{sys.maxsize}}

On 64-bit systems {{sys.maxsize}} is typically equal to (1 << 63) - 1, on 32-bit 
systems to (1 << 31) - 1 
(https://docs.python.org/3/library/sys.html#sys.maxsize).

After SPARK-17845 these values are:

-  {{UNBOUNDED PRECEDING}} is -(1 << 63)
-  {{UNBOUNDED FOLLOWING}} is (1 << 63) - 1

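As a result, on many systems existing code that passes {{-sys.maxsize}} will no 
longer select the UNBOUNDED PRECEDING frame: on 64-bit systems {{-sys.maxsize}} 
is -(1 << 63) + 1, which does not match the new constant -(1 << 63).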
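
The mismatch can be checked directly in Python. This is a minimal sketch, not Spark code: the names JAVA_MIN_LONG and JAVA_MAX_LONG are illustrative and simply hold the two values listed above.

```python
import sys

# Bounds introduced by SPARK-17845, as quoted in this issue.
# The constant names here are illustrative, not Spark identifiers.
JAVA_MIN_LONG = -(1 << 63)
JAVA_MAX_LONG = (1 << 63) - 1

# Sentinels that user code written for Spark <= 2.0 would pass.
legacy_preceding = -sys.maxsize
legacy_following = sys.maxsize

# The legacy lower sentinel is always strictly greater than the new one,
# so an equality check against the new constant fails.
preceding_still_matches = legacy_preceding == JAVA_MIN_LONG

# The upper sentinel matches only where sys.maxsize == (1 << 63) - 1,
# which is typical for 64-bit builds but not for 32-bit ones.
following_still_matches = legacy_following == JAVA_MAX_LONG

print("PRECEDING matches:", preceding_still_matches)
print("FOLLOWING matches:", following_still_matches)
```

The lower bound is the problematic one on every platform, while the upper bound only diverges where {{sys.maxsize}} is smaller than the Java long maximum.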

We can use the following values to ensure backward compatibility:

- {{UNBOUNDED PRECEDING}} =  {{max(-sys.maxsize, _JAVA_MIN_LONG)}}
- {{UNBOUNDED FOLLOWING}} =  {{min(sys.maxsize, _JAVA_MAX_LONG)}}

Pros:
- Prevents hard-to-spot errors in user code.

Cons:
- Unnecessarily complicated rules in the Spark code.



