Maciej Szymkiewicz created SPARK-18690:
------------------------------------------
Summary: Backward compatibility of unbounded frames
Key: SPARK-18690
URL: https://issues.apache.org/jira/browse/SPARK-18690
Project: Spark
Issue Type: Improvement
Components: PySpark, SQL
Affects Versions: 2.1.0
Reporter: Maciej Szymkiewicz
Priority: Minor
SPARK-17845 introduced constant values to mark unbounded frames. This can break
backward compatibility on some systems:
In Spark <= 2.0:
- {{UNBOUNDED PRECEDING}} is {{-sys.maxsize}}
- {{UNBOUNDED FOLLOWING}} is {{sys.maxsize}}
On 64-bit systems {{sys.maxsize}} is typically equal to (1 << 63) - 1, on 32-bit
systems (1 << 31) - 1
(https://docs.python.org/3/library/sys.html#sys.maxsize).
After SPARK-17845 these values are:
- {{UNBOUNDED PRECEDING}} is -(1 << 63)
- {{UNBOUNDED FOLLOWING}} is (1 << 63) - 1
As a result, on many systems existing code that passes {{-sys.maxsize}} will no
longer produce an UNBOUNDED PRECEDING frame, because {{-sys.maxsize}} is greater
than -(1 << 63).
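The mismatch can be checked directly in plain Python. The following is only a sketch that uses the values quoted in this report; the names JAVA_MIN_LONG and JAVA_MAX_LONG are illustrative and are not taken from the Spark source:

```python
import sys

# Frame boundary values after SPARK-17845, as quoted in this report.
# The constant names are illustrative, not copied from Spark.
JAVA_MIN_LONG = -(1 << 63)
JAVA_MAX_LONG = (1 << 63) - 1

# Values that code written for Spark <= 2.0 passes for unbounded frames.
old_preceding = -sys.maxsize
old_following = sys.maxsize

# Does the old value still coincide with the new marker on this platform?
preceding_matches = old_preceding == JAVA_MIN_LONG
following_matches = old_following == JAVA_MAX_LONG

print("UNBOUNDED PRECEDING still matches:", preceding_matches)
print("UNBOUNDED FOLLOWING still matches:", following_matches)
```

The first comparison is false wherever {{sys.maxsize}} is at most (1 << 63) - 1, which covers both the 32-bit and 64-bit cases described above. The second comparison depends on the platform.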
We can use the following values to ensure backward compatibility:
- {{UNBOUNDED PRECEDING}} = {{max(-sys.maxsize, _JAVA_MIN_LONG)}}
- {{UNBOUNDED FOLLOWING}} = {{min(sys.maxsize, _JAVA_MAX_LONG)}}
Pros:
- Prevents hard-to-spot errors in user code.
Cons:
- Unnecessarily complicated rules in the Spark code.