[
https://issues.apache.org/jira/browse/SPARK-44207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
huizhong xu updated SPARK-44207:
--------------------------------
Issue Type: Bug (was: Question)
> Where Clause throwing Resolved attribute(s) _metadata#398 missing from ...
> error
> --------------------------------------------------------------------------------
>
> Key: SPARK-44207
> URL: https://issues.apache.org/jira/browse/SPARK-44207
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.1
> Reporter: huizhong xu
> Priority: Major
>
> I have two DataFrames, lt and rt, with the same schema and only one row each.
> They are generated separately by our own curation logic, and every column is
> a String, Boolean, or Timestamp. To compare them, I first join the two:
> var joinedDF = lt.join(rt, "Id")
> After that, I compare them by schema first and then column by column, to
> find what percentage of rows match.
> The code is roughly like this:
> for (column <- lt.schema) {
>   if (rt.columns.contains(column.name) &&
>       column.dataType == rt.schema(column.name).dataType) {
>     var matchCount = joinedCount
>     if (column.dataType.typeName == "string") {
>       matchCount = joinedDF.where(lt(column.name) <=> rt(column.name)).count
>     } else
>       .....
>
> The last line shown, where I run the where clause, throws an
> AnalysisException: Resolved attribute(s) _metadata#398 missing from ....
> I don't have a _metadata column anywhere in my DataFrames. From searching
> online, people say this is a problem with the join. I tried renaming the
> columns in rt and in joinedDF, but neither helped; the same error is still
> thrown. Can anybody help?
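
For reference, the per-column comparison described in the report can be sketched as a self-contained program. This is only a sketch under stated assumptions: the object name CompareColumns, the helper matchCounts, the aliases "l" and "r", and the sample rows are all hypothetical and do not come from the report. It refers to columns through aliased, qualified names instead of lt(...) and rt(...), and it has not been confirmed to avoid the _metadata error on 3.3.1.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StringType

object CompareColumns {
  // Hypothetical helper: for every string column present in both sides with
  // the same type, count the joined rows whose values are null-safe equal.
  def matchCounts(lt: DataFrame, rt: DataFrame, key: String): Map[String, Long] = {
    // Alias both sides so each column reference is qualified by its side.
    val joined = lt.as("l").join(rt.as("r"), col(s"l.`$key`") === col(s"r.`$key`"))
    lt.schema.fields
      .filter { f =>
        f.dataType == StringType &&
        rt.columns.contains(f.name) &&
        rt.schema(f.name).dataType == f.dataType
      }
      .map { f =>
        val same = col(s"l.`${f.name}`") <=> col(s"r.`${f.name}`")
        f.name -> joined.where(same).count()
      }
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // Local session and made-up one-row inputs, for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("compare").getOrCreate()
    import spark.implicits._
    val lt = Seq(("1", "a", "x")).toDF("Id", "name", "city")
    val rt = Seq(("1", "a", "y")).toDF("Id", "name", "city")
    println(matchCounts(lt, rt, "Id"))
    spark.stop()
  }
}
```

Unlike lt.join(rt, "Id"), the aliased join keeps the key column from both sides, so every column is addressed as l.<name> or r.<name>.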