GitHub user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20400#discussion_r164261850
--- Diff: python/pyspark/sql/window.py ---
@@ -124,16 +124,19 @@ def rangeBetween(start, end):
         values directly.
         :param start: boundary start, inclusive.
-                      The frame is unbounded if this is ``Window.unboundedPreceding``, or
+                      The frame is unbounded if this is ``Window.unboundedPreceding``,
+                      ``org.apache.spark.sql.catalyst.expressions.UnboundedPreceding``, or
                       any value less than or equal to max(-sys.maxsize, -9223372036854775808).
         :param end: boundary end, inclusive.
-                    The frame is unbounded if this is ``Window.unboundedFollowing``, or
+                    The frame is unbounded if this is ``Window.unboundedFollowing``,
+                    ``org.apache.spark.sql.catalyst.expressions.UnboundedFollowing``, or
                     any value greater than or equal to min(sys.maxsize, 9223372036854775807).
         """
-        if start <= Window._PRECEDING_THRESHOLD:
-            start = Window.unboundedPreceding
-        if end >= Window._FOLLOWING_THRESHOLD:
-            end = Window.unboundedFollowing
+        if isinstance(start, int) and isinstance(end, int):
+            if start <= Window._PRECEDING_THRESHOLD:
+                start = Window.unboundedPreceding
--- End diff --
@jiangxb1987
Do you mean changing it to the following?
```
if isinstance(start, int) and isinstance(end, int):
    if start == Window._PRECEDING_THRESHOLD:
        # Window._PRECEDING_THRESHOLD == Long.MinValue
        start = Window.unboundedPreceding
    if end == Window._FOLLOWING_THRESHOLD:
        # Window._FOLLOWING_THRESHOLD == Long.MaxValue
        end = Window.unboundedFollowing
```
I ran the Python tests, and tests.py failed at:
```
with patch("sys.maxsize", 2 ** 127 - 1):
    importlib.reload(window)
    self.assertTrue(rows_frame_match())
    self.assertTrue(range_frame_match())
```
So I guess I will keep `if start <= Window._PRECEDING_THRESHOLD` and `if end >= Window._FOLLOWING_THRESHOLD`.
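
For reference, here is a minimal sketch of why the `==` comparison could fail under that patch while `<=` / `>=` still pass. It assumes the thresholds are computed as the docstring describes (`max(-sys.maxsize, Long.MinValue)` and `min(sys.maxsize, Long.MaxValue)`) and that the test passes `-sys.maxsize` and `sys.maxsize` as the frame boundaries. The names `JAVA_MIN_LONG`, `JAVA_MAX_LONG`, `thresholds`, and `is_unbounded` are hypothetical stand-ins, not the actual `window.py` code.

```python
# Hypothetical stand-ins for the Java long limits used in window.py.
JAVA_MIN_LONG = -(1 << 63)     # -9223372036854775808
JAVA_MAX_LONG = (1 << 63) - 1  # 9223372036854775807


def thresholds(maxsize):
    """Compute the thresholds as the docstring describes, for a given sys.maxsize."""
    preceding = max(-maxsize, JAVA_MIN_LONG)
    following = min(maxsize, JAVA_MAX_LONG)
    return preceding, following


def is_unbounded(start, end, maxsize, strict):
    """Return (start_unbounded, end_unbounded).

    strict=True uses the == comparison, strict=False uses <= and >=.
    """
    preceding, following = thresholds(maxsize)
    if strict:
        return start == preceding, end == following
    return start <= preceding, end >= following


if __name__ == "__main__":
    for maxsize in (2 ** 31 - 1, 2 ** 63 - 1, 2 ** 127 - 1):
        # Assumption: the test uses -sys.maxsize and sys.maxsize as boundaries.
        start, end = -maxsize, maxsize
        print(
            maxsize.bit_length(),
            is_unbounded(start, end, maxsize, strict=True),
            is_unbounded(start, end, maxsize, strict=False),
        )
```

Under these assumptions, once `sys.maxsize` exceeds the Java long range the thresholds are clamped to `Long.MinValue` and `Long.MaxValue`, so a boundary of `-sys.maxsize` is strictly below the threshold and only the `<=` check treats it as unbounded.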