Shawn Guo created SPARK-4533:
--------------------------------

             Summary: Can only subtract another SchemaRDD
                 Key: SPARK-4533
                 URL: https://issues.apache.org/jira/browse/SPARK-4533
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.1.0
         Environment: JDK6/7
            Reporter: Shawn Guo
            Priority: Minor


Two SchemaRDD APIs perform an unexpected type validation on their argument:
subtract(self, other, numPartitions=None)
        "Can only subtract another SchemaRDD"
intersection(self, other)
        "Can only intersect with another SchemaRDD"

"Can only subtract another SchemaRDD" is thrown whenever a SchemaRDD subtracts
any other type of RDD.

Steps to reproduce:
A = SchemaRDD
B = SchemaRDD

A_APX = A.keyBy(lambda line: None)
B_APX = B.keyBy(lambda line: None)
CROSSED = A_APX.join(B_APX).map(lambda line: line[1]).filter(filter condition).map(lambda line: line[0])
C = A.subtract(CROSSED)  # ERROR: Can only subtract another SchemaRDD
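
The failure can be illustrated without a Spark cluster. The sketch below is a toy model, not PySpark's source: the RDD and SchemaRDD classes are hypothetical stand-ins written for this example, the sample rows are invented, and the filter predicate is an assumed placeholder for the unspecified "filter condition". It assumes, as the steps above suggest, that keyBy/join/map/filter yield a plain RDD, so CROSSED is no longer a SchemaRDD by the time it reaches subtract.

```python
# Toy stand-ins for illustration only; these are NOT PySpark's classes.
class RDD:
    def __init__(self, items):
        self.items = list(items)

    def map(self, f):
        # Transformations defined on the base class always yield a plain RDD.
        return RDD(f(x) for x in self.items)

    def filter(self, pred):
        return RDD(x for x in self.items if pred(x))

    def keyBy(self, f):
        return self.map(lambda x: (f(x), x))

    def join(self, other):
        # Pair up values whose keys match (a naive nested loop).
        return RDD((k, (v, w))
                   for k, v in self.items
                   for k2, w in other.items if k == k2)

    def subtract(self, other, numPartitions=None):
        return RDD(x for x in self.items if x not in other.items)

    def collect(self):
        return list(self.items)


class SchemaRDD(RDD):
    def subtract(self, other, numPartitions=None):
        # Models the validation described in this report.
        if not isinstance(other, SchemaRDD):
            raise ValueError("Can only subtract another SchemaRDD")
        return SchemaRDD(x for x in self.items if x not in other.items)


# Invented sample rows.
A = SchemaRDD([("a", 1), ("b", 2)])
B = SchemaRDD([("a", 1), ("c", 3)])

A_APX = A.keyBy(lambda line: None)
B_APX = B.keyBy(lambda line: None)
CROSSED = (A_APX.join(B_APX)
           .map(lambda line: line[1])
           .filter(lambda pair: pair[0] == pair[1])  # assumed filter condition
           .map(lambda line: line[0]))

try:
    C = A.subtract(CROSSED)
    error = None
except ValueError as e:
    error = str(e)

print(type(CROSSED).__name__, "->", error)
```

In this model CROSSED holds rows that came from A, yet the type check rejects it because the intermediate transformations returned the base class.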

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

