Carlos created SPARK-26336:
------------------------------

             Summary: left_anti join with Na Values
                 Key: SPARK-26336
                 URL: https://issues.apache.org/jira/browse/SPARK-26336
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.2.0
            Reporter: Carlos


When I join two DataFrames whose data contains NA (None/null) values, the
left_anti join does not work as expected, because it does not match rows
that contain NA values.

Example:  
{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('test').enableHiveSupport().getOrCreate()

# Two identical DataFrames; the last row has a null in "columndata".
data = [(1, "Test"), (2, "Test"), (3, None)]
df1 = spark.createDataFrame(data, ("id", "columndata"))
df2 = spark.createDataFrame(data, ("id", "columndata"))

# Anti join on all columns: should keep only df1 rows with no match in df2.
df_joined = df1.join(df2, df1.columns, 'left_anti')
{code}
df_joined contains rows even though the two DataFrames are identical, so I
would expect it to be empty.
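
A possible explanation, sketched in plain Python rather than Spark: under
SQL comparison semantics, NULL = NULL evaluates to unknown rather than true,
so a row containing None never finds a match in an equality join and is kept
by the anti join. The helper names below (sql_eq, null_safe_eq, left_anti)
are hypothetical and only model that behaviour; they are not Spark APIs.

```python
def sql_eq(a, b):
    """SQL-style equality: the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def null_safe_eq(a, b):
    """Null-safe equality (modelled on SQL's <=>): NULL is equal to NULL."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def left_anti(left, right, eq):
    """Keep the left rows for which no right row matches on every column."""
    def matches(l_row, r_row):
        # A join condition only matches when it is definitely true.
        return all(eq(x, y) is True for x, y in zip(l_row, r_row))
    return [l for l in left if not any(matches(l, r) for r in right)]


data = [(1, "Test"), (2, "Test"), (3, None)]
plain = left_anti(data, data, sql_eq)
null_safe = left_anti(data, data, null_safe_eq)
print(plain)      # [(3, None)]
print(null_safe)  # []
```

If this is the cause, a join condition built on null-safe equality (the <=>
operator in Spark SQL) should treat the None rows as matching; that is an
assumption to verify against the affected Spark version.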



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
