Chris Suchanek created SPARK-30989:
--------------------------------------
Summary: TABLE.COLUMN reference doesn't work with new columns created by UDF
Key: SPARK-30989
URL: https://issues.apache.org/jira/browse/SPARK-30989
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.4.4
Reporter: Chris Suchanek
When a DataFrame is created with an alias (`.as("...")`), its columns can be
referred to as `TABLE.COLUMN`, but this does not work for columns newly created
with a UDF.
{code:java}
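import org.apache.spark.sql.functions.{col, udf}

// l is a local collection of two-field tuples; its definition is not shown here
val df1 = sc.parallelize(l).toDF("x","y").as("cat")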
val squared = udf((s: Int) => s * s)
val df2 = df1.withColumn("z", squared(col("y")))
df2.columns //Array[String] = Array(x, y, z)
df2.select("cat.x") // works
df2.select("cat.z") // Doesn't work
// org.apache.spark.sql.AnalysisException: cannot resolve '`cat.z`' given input
// columns: [cat.x, cat.y, z];;
{code}
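For comparison, here is a minimal sketch of two lookups that should resolve the new column. It assumes a spark-shell style session; the local `SparkSession` setup and the sample rows are illustrative assumptions, not part of the original reproduction. It shows that the unqualified name resolves, and that re-applying the alias after `withColumn` qualifies every column, including `z`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Assumption: a local session and sample data, used only for illustration.
val spark = SparkSession.builder().master("local[1]").appName("alias-sketch").getOrCreate()
import spark.implicits._

val df1 = Seq((1, 2), (3, 4)).toDF("x", "y").as("cat")
val squared = udf((s: Int) => s * s)
val df2 = df1.withColumn("z", squared(col("y")))

// The new column carries no "cat" qualifier, so the bare name resolves.
val byName = df2.select("z")

// Re-applying the alias wraps the whole plan, so every output column is qualified again.
val realiased = df2.as("cat").select("cat.x", "cat.z")
```

Neither lookup changes the behaviour reported above; they only avoid the qualified reference on the un-aliased result.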
Might be related to: https://issues.apache.org/jira/browse/SPARK-30532