GitHub user jiangxb1987 opened a pull request:

    https://github.com/apache/spark/pull/22853

    [SPARK-25845][SQL] Fix MatchError for calendar interval type in range frame 
left boundary

    ## What changes were proposed in this pull request?
    
    `WindowSpecDefinition` checks that the frame's lower boundary is less than its upper boundary (start < end), but `CalendarIntervalType` is not comparable, so the check throws the following exception at runtime:
    
    ```
    scala.MatchError: CalendarIntervalType (of class org.apache.spark.sql.types.CalendarIntervalType$)
      at org.apache.spark.sql.catalyst.util.TypeUtils$.getInterpretedOrdering(TypeUtils.scala:58)
      at org.apache.spark.sql.catalyst.expressions.BinaryComparison.ordering$lzycompute(predicates.scala:592)
      at org.apache.spark.sql.catalyst.expressions.BinaryComparison.ordering(predicates.scala:592)
      at org.apache.spark.sql.catalyst.expressions.GreaterThan.nullSafeEval(predicates.scala:797)
      at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:496)
      at org.apache.spark.sql.catalyst.expressions.SpecifiedWindowFrame.isGreaterThan(windowExpressions.scala:245)
      at org.apache.spark.sql.catalyst.expressions.SpecifiedWindowFrame.checkInputDataTypes(windowExpressions.scala:216)
      at org.apache.spark.sql.catalyst.expressions.Expression.resolved$lzycompute(Expression.scala:171)
      at org.apache.spark.sql.catalyst.expressions.Expression.resolved(Expression.scala:171)
      at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:183)
      at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:183)
      at scala.collection.IndexedSeqOptimized$class.prefixLengthImpl(IndexedSeqOptimized.scala:38)
      at scala.collection.IndexedSeqOptimized$class.forall(IndexedSeqOptimized.scala:43)
      at scala.collection.mutable.ArrayBuffer.forall(ArrayBuffer.scala:48)
      at org.apache.spark.sql.catalyst.expressions.Expression.childrenResolved(Expression.scala:183)
      at org.apache.spark.sql.catalyst.expressions.WindowSpecDefinition.resolved$lzycompute(windowExpressions.scala:48)
      at org.apache.spark.sql.catalyst.expressions.WindowSpecDefinition.resolved(windowExpressions.scala:48)
      at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:183)
      at org.apache.spark.sql.catalyst.expressions.Expression$$anonfun$childrenResolved$1.apply(Expression.scala:183)
      at scala.collection.LinearSeqOptimized$class.forall(LinearSeqOptimized.scala:83)
    ```
    
    We fix the issue by only performing the boundary check when the boundary expressions are of `AtomicType`.
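    The guarded check can be sketched in plain Scala. This is an illustrative simplification, not Spark's actual classes: `DataType`, `isAtomic`, and `boundariesValid` here are stand-ins for the real Catalyst types and for the logic in `SpecifiedWindowFrame.checkInputDataTypes`.

    ```scala
    // Minimal sketch of the fix (stand-in types, not Spark's API).
    sealed trait DataType
    case object IntegerType extends DataType
    case object CalendarIntervalType extends DataType

    // Only atomic types have an ordering usable for the start < end sanity check.
    def isAtomic(dt: DataType): Boolean = dt match {
      case IntegerType         => true
      case CalendarIntervalType => false
    }

    // Before the fix, the lower > upper comparison ran unconditionally and the
    // interpreted-ordering lookup hit a MatchError for CalendarIntervalType.
    // After the fix, the comparison is only attempted for atomic boundary types;
    // non-comparable types skip the check and pass through.
    def boundariesValid(dt: DataType, lower: Long, upper: Long): Boolean =
      if (isAtomic(dt)) lower <= upper else true
    ```

    With this guard, an interval-typed range frame no longer triggers the comparison at all, while invalid atomic boundaries (e.g. `lower > upper` on integers) are still rejected.
    
    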
    
    ## How was this patch tested?
    
    Added a new test case in `DataFrameWindowFramesSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark windowBoundary

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22853.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22853
    
----
commit 9d2a1b27caefb6b61c767d7971782b9a74e5d199
Author: Xingbo Jiang <xingbo.jiang@...>
Date:   2018-10-26T15:41:32Z

    fix CalendarIntervalType window boundary failure

----


---
