[ https://issues.apache.org/jira/browse/SPARK-23761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-23761.
----------------------------------
    Resolution: Cannot Reproduce

> Dataframe filter(udf) followed by groupby in pyspark throws a casting error
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-23761
>                 URL: https://issues.apache.org/jira/browse/SPARK-23761
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.6.0
>         Environment: pyspark 1.6.0
> Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37) 
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
> CentOS 6.7
>            Reporter: Dhaniram Kshirsagar
>            Priority: Major
>
> In PySpark, we get the following exception when a DataFrame filter() that 
> uses a UDF is followed by groupby():
> # Snippet of error observed in pyspark
> {code:java}
> py4j.protocol.Py4JJavaError: An error occurred while calling o56.filter.
> : java.lang.ClassCastException: 
> org.apache.spark.sql.catalyst.plans.logical.Project cannot be cast to 
> org.apache.spark.sql.catalyst.plans.logical.Aggregate{code}
> This looks similar to https://issues.apache.org/jira/browse/SPARK-12981, but 
> I am not sure whether it is the same issue.
>  
> Here is a gist with the PySpark steps to reproduce this issue:
> [https://gist.github.com/dhaniram-kshirsagar/d72545620b6a05d145a1a6bece797b6d]
>  
>  


