[ 
https://issues.apache.org/jira/browse/SPARK-23761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420339#comment-16420339
 ] 

Dhaniram Kshirsagar commented on SPARK-23761:
---------------------------------------------

Sure, I will try it with the latest version of pyspark and let you know. In the 
meantime, could you let us know whether those fixes could be back-ported to 
pyspark 1.6 [the version we have]?

> Dataframe filter(udf) followed by groupby in pyspark throws a casting error
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-23761
>                 URL: https://issues.apache.org/jira/browse/SPARK-23761
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.6.0
>         Environment: pyspark 1.6.0
> Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37) 
> [GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
> CentOS 6.7
>            Reporter: Dhaniram Kshirsagar
>            Priority: Major
>
> On pyspark with a DataFrame, we get the following exception when a 
> filter (with a UDF) is followed by a groupby:
> # Snippet of error observed in pyspark
> {code:java}
> py4j.protocol.Py4JJavaError: An error occurred while calling o56.filter.
> : java.lang.ClassCastException: 
> org.apache.spark.sql.catalyst.plans.logical.Project cannot be cast to 
> org.apache.spark.sql.catalyst.plans.logical.Aggregate{code}
> This looks similar to https://issues.apache.org/jira/browse/SPARK-12981, but 
> we are not sure whether it is the same issue.
>  
> Here is gist with pyspark steps to reproduce this issue:
> [https://gist.github.com/dhaniram-kshirsagar/d72545620b6a05d145a1a6bece797b6d]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
