huizhong xu created SPARK-44207:
-----------------------------------

             Summary: Where Clause throwing Resolved attribute(s) _metadata#398 missing from ... error
                 Key: SPARK-44207
                 URL: https://issues.apache.org/jira/browse/SPARK-44207
             Project: Spark
          Issue Type: Question
          Components: SQL
    Affects Versions: 3.3.1
            Reporter: huizhong xu


I have two DataFrames, lt and rt, both with the same schema and only one row 
each, generated separately by our own curation logic. All the columns are 
String, Boolean, or Timestamp. I am trying to compare them, and I join the 
two like this:

var joinedDF = lt.join(rt, "Id")

After that, I compare them first by schema and then column by column, to see 
what percentage of rows are the same.

The code is roughly like this:

for (column <- lt.schema) {
  if (rt.columns.contains(column.name) &&
      column.dataType == rt.schema(column.name).dataType) {

    var matchCount = joinedCount
    if (column.dataType.typeName == "string") {
      matchCount = joinedDF.where(lt(column.name) <=> rt(column.name)).count
    }
    else
    .....

 

On the line with the where clause, Spark throws an AnalysisException: 
Resolved attribute(s) _metadata#398 missing from .... I don't have a 
_metadata column anywhere in my DataFrames.

I searched online, and people say this is a problem with the join. I tried 
renaming the columns in rt and in joinedDF; neither worked, and the same 
error is still thrown. Can anybody help here?
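
For reference, here is a minimal sketch of a variant that might be worth trying. It gives each side of the join an alias and refers to columns through those aliases with col(...), instead of through the original lt and rt objects. This is a commonly suggested workaround for "Resolved attribute(s) ... missing" errors after a join; whether it fixes the _metadata case above is not confirmed. The object name ColumnMatchSketch, the helper matchCounts, and the aliases "l" and "r" are hypothetical names introduced for illustration only.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object ColumnMatchSketch {
  // Hypothetical helper (not from the original code): for every non-key column
  // present in both frames with the same data type, count the joined rows whose
  // values are null-safe equal.
  def matchCounts(lt: DataFrame, rt: DataFrame, key: String): Map[String, Long] = {
    // Alias both sides so columns can be referenced by qualified name
    // on the joined frame rather than via lt(...) / rt(...).
    val joined = lt.as("l").join(rt.as("r"), Seq(key))

    lt.schema.fields
      .filter { f =>
        f.name != key &&
        rt.columns.contains(f.name) &&
        f.dataType == rt.schema(f.name).dataType
      }
      .map { f =>
        // Backticks guard against column names containing dots or spaces.
        val same = col(s"l.`${f.name}`") <=> col(s"r.`${f.name}`")
        f.name -> joined.where(same).count()
      }
      .toMap
  }
}
```

Calling ColumnMatchSketch.matchCounts(lt, rt, "Id") would return one count per shared column, which can then be divided by the joined row count to get a percentage.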



