Pedro Correia Luis created SPARK-29594:
------------------------------------------
Summary: Create a Dataset from a Sequence of Case class
Key: SPARK-29594
URL: https://issues.apache.org/jira/browse/SPARK-29594
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.4.4
Reporter: Pedro Correia Luis
The Dataset code generation logic fails to handle case class field names that
are not valid Java identifiers (e.g. "1_something", which starts with a digit).
Scala's backquote escaping allows such names, as well as Java and Scala
keywords, to be used as identifiers in programs, as in the example below:
import spark.implicits._  // needed for toDS(); spark-shell imports this automatically
case class Foo(`1_something`: String)
val test = Seq(Foo("HelloWorld!")).toDS()
However, this case class trips up the Dataset code generator. Processing a
Dataset that contains instances of such a case class fails with the following
error:
java.lang.RuntimeException: Error while encoding:
java.util.concurrent.ExecutionException:
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line
316, Column 15: failed to compile:
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line
316, Column 15: Expression "funcResult_2 = value_19" is not a type
staticinvoke(class org.apache.spark.unsafe.types.UTF8String, StringType,
fromString, unwrapoption(ObjectType(class java.lang.String),
assertnotnull(assertnotnull(input[0, Foo, true])).1_something), true, false) AS
1_something#40
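
One plausible reading of the error above is that the field name ends up
verbatim in the generated Java source, where "1_something" is not a legal
identifier. That is an assumption, not something confirmed in this report. The
sketch below only shows how a field name could be checked against Java's
identifier rules using the JDK's javax.lang.model.SourceVersion; the object
name IdentifierCheck and the helper isValidJavaName are hypothetical and are
not part of Spark.

```scala
import javax.lang.model.SourceVersion

// Hypothetical helper, not part of Spark: reports whether a name could
// appear verbatim as an identifier in Java source code.
object IdentifierCheck {
  def isValidJavaName(name: String): Boolean =
    // isIdentifier also accepts keywords, so keywords are excluded explicitly.
    SourceVersion.isIdentifier(name) && !SourceVersion.isKeyword(name)

  def main(args: Array[String]): Unit = {
    // "1_something" is the field name from this report; the others are
    // illustrative comparisons.
    Seq("1_something", "something_1", "class").foreach { n =>
      println(s"$n -> ${isValidJavaName(n)}")
    }
  }
}
```

Under these rules "1_something" is rejected because it starts with a digit,
while "something_1" is accepted. A check like this could be used to spot
affected case class fields before calling toDS().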