Can you submit a pull request with test cases based on that change?
On Dec 1, 2016, 9:39 AM -0800, Maciej Szymkiewicz <mszymkiew...@gmail.com>, wrote:
> This doesn't affect that. The only concern is what we consider to be
> UNBOUNDED on the Python side.
>
> On 12/01/2016 07:56 AM, assaf.mendelson wrote:
> > I may be mistaken, but if I remember correctly Spark behaves differently
> > when the window is bounded in the past and when it is not. Specifically,
> > I seem to recall a fix which made sure that when there is no lower bound
> > the aggregation is done one by one instead of doing the whole range for
> > each window. So I believe it should be configured exactly the same as in
> > Scala/Java so the optimization would take place.
> > Assaf.
> >
> > From: rxin [via Apache Spark Developers List] [mailto:ml-node+[hidden email]]
> > Sent: Wednesday, November 30, 2016 8:35 PM
> > To: Mendelson, Assaf
> > Subject: Re: [SPARK-17845] [SQL][PYTHON] More self-evident window function
> > frame boundary API
> >
> > Yes, I'd define unboundedPreceding to -sys.maxsize, but also any value less
> > than min(-sys.maxsize, _JAVA_MIN_LONG) is considered unboundedPreceding
> > too. We need to be careful with long overflow when transferring data over
> > to Java.
> >
> > On Wed, Nov 30, 2016 at 10:04 AM, Maciej Szymkiewicz <[hidden email]> wrote:
> > It is platform specific so theoretically it can be larger, but 2**63 - 1
> > is standard on 64-bit platforms and 2**31 - 1 on 32-bit platforms. I can
> > submit a patch but I am not sure how to proceed. Personally I would set
> >
> > unboundedPreceding = -sys.maxsize
> >
> > unboundedFollowing = sys.maxsize
> >
> > to keep backwards compatibility.
> >
> > On 11/30/2016 06:52 PM, Reynold Xin wrote:
> > > Ah ok, for some reason when I did the pull request sys.maxsize was much
> > > larger than 2^63. Do you want to submit a patch to fix this?
> > >
> > > On Wed, Nov 30, 2016 at 9:48 AM, Maciej Szymkiewicz <[hidden email]> wrote:
> > > The problem is that -(1 << 63) is -(sys.maxsize + 1), so the code which
> > > used to work before is off by one.
> > >
> > > On 11/30/2016 06:43 PM, Reynold Xin wrote:
> > > > Can you give a repro? Anything less than -(1 << 63) is considered
> > > > negative infinity (i.e. unbounded preceding).
> > > >
> > > > On Wed, Nov 30, 2016 at 8:27 AM, Maciej Szymkiewicz <[hidden email]> wrote:
> > > > Hi,
> > > >
> > > > I've been looking at SPARK-17845 and I am curious if there is any
> > > > reason to make it a breaking change. In Spark 2.0 and below we could
> > > > use:
> > > >
> > > > Window().partitionBy("foo").orderBy("bar").rowsBetween(-sys.maxsize, sys.maxsize)
> > > >
> > > > In 2.1.0 this code will silently produce incorrect results (ROWS BETWEEN
> > > > -1 PRECEDING AND UNBOUNDED FOLLOWING). Couldn't we use
> > > > Window.unboundedPreceding equal to -sys.maxsize to ensure backward
> > > > compatibility?
> > > >
> > > > --
> > > > Maciej Szymkiewicz
>
> --
> Maciej Szymkiewicz
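
For reference, the boundary arithmetic discussed in this thread can be sketched in plain Python. This is only an illustration under stated assumptions: `normalize_frame_bounds` and the upper-case constant names are hypothetical and are not taken from the Spark source. The thresholds follow the suggestion in the thread that any value at or beyond -sys.maxsize / sys.maxsize should count as unbounded, while staying inside the Java long range.

```python
import sys

# Java long limits (hypothetical names, chosen for this sketch only).
JAVA_MIN_LONG = -(1 << 63)
JAVA_MAX_LONG = (1 << 63) - 1

# On a typical 64-bit build sys.maxsize == 2**63 - 1, so
# -(1 << 63) == -(sys.maxsize + 1): this is the off-by-one from the thread.
# Taking max/min keeps the thresholds inside the Java long range even if
# sys.maxsize were larger on some platform.
PRECEDING_THRESHOLD = max(-sys.maxsize, JAVA_MIN_LONG)
FOLLOWING_THRESHOLD = min(sys.maxsize, JAVA_MAX_LONG)


def normalize_frame_bounds(start, end):
    """Map 'unbounded-like' Python ints to the Java long sentinels.

    Hypothetical helper: values at or below PRECEDING_THRESHOLD become
    JAVA_MIN_LONG, values at or above FOLLOWING_THRESHOLD become
    JAVA_MAX_LONG, and everything else is passed through unchanged.
    """
    if start <= PRECEDING_THRESHOLD:
        start = JAVA_MIN_LONG
    if end >= FOLLOWING_THRESHOLD:
        end = JAVA_MAX_LONG
    return start, end
```

With a mapping like this, the pre-2.1 idiom rowsBetween(-sys.maxsize, sys.maxsize) would still resolve to an unbounded frame, and ordinary offsets such as (-1, 1) would be left alone.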