huizhong xu created SPARK-44207:
-----------------------------------
Summary: Where Clause throwing Resolved attribute(s) _metadata#398
missing from ... error
Key: SPARK-44207
URL: https://issues.apache.org/jira/browse/SPARK-44207
Project: Spark
Issue Type: Question
Components: SQL
Affects Versions: 3.3.1
Reporter: huizhong xu
I have two DataFrames called lt and rt, both with the same schema and only one
row each. They are generated separately by our own curation logic, and every
column is a String, Boolean, or Timestamp. I am trying to compare them, so I
join the two like this:
var joinedDF = lt.join(rt, "Id")
After that, I compare them first by schema and then column by column, to see
what percentage of rows are the same.
The code is roughly like this:
for (column <- lt.schema) {
  if (rt.columns.contains(column.name) &&
      column.dataType == rt.schema(column.name).dataType) {
    var matchCount = joinedCount
    if (column.dataType.typeName == "string") {
      matchCount = joinedDF.where(lt(column.name) <=> rt(column.name)).count
    } else
      .....
The line that runs the where clause throws an AnalysisException:
Resolved attribute(s) _metadata#398 missing from ....
I don't have a _metadata column anywhere in my DataFrames. I searched online,
and people say it is a problem with the join. I tried renaming the columns in
rt and in joinedDF, but neither helped; the same error is still thrown. Can
anybody help here?
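
For reference, a workaround often suggested for "Resolved attribute(s) ...
missing from" errors after a join is to alias both sides and refer to columns
through the alias, rather than through lt(...) and rt(...). The sketch below
shows that pattern applied to the comparison loop above. It is only a sketch
under assumptions: the sample rows, the column names "name" and "active", the
aliases "l" and "r", and the local SparkSession are hypothetical stand-ins,
and it is not confirmed that this resolves the error in this report. The
_metadata attribute is possibly the hidden column that file-based sources
expose in recent Spark versions, which would explain why it is not visible in
the schema; that is also an assumption.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object CompareSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical local session, used only to make the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("compare-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for lt and rt: same schema, one row each.
    val lt = Seq(("1", "a", true)).toDF("Id", "name", "active")
    val rt = Seq(("1", "a", false)).toDF("Id", "name", "active")

    // Alias each side so columns are resolved by qualified name in the
    // joined plan instead of by the attribute references held by lt and rt.
    val joinedDF = lt.as("l").join(rt.as("r"), col("l.Id") === col("r.Id"))
    val joinedCount = joinedDF.count()

    for (field <- lt.schema if field.name != "Id") {
      if (rt.columns.contains(field.name) &&
          field.dataType == rt.schema(field.name).dataType) {
        // <=> is the null-safe equality operator on Column.
        val matchCount = joinedDF
          .where(col(s"l.${field.name}") <=> col(s"r.${field.name}"))
          .count()
        println(s"${field.name}: $matchCount of $joinedCount rows match")
      }
    }

    spark.stop()
  }
}
```

The explicit join condition keeps both Id columns in the result, so every
column stays addressable as l.<name> or r.<name>. Column names that contain
dots would need backtick quoting inside col(...).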