mickjermsurawong-stripe commented on pull request #33205: URL: https://github.com/apache/spark/pull/33205#issuecomment-894881771
That's correct @srowen. A nested `AnyVal` class does not currently work. For a value class in a nested schema (1), the schema we describe declares the field as the `AnyVal` class (2), but accessing that nested value actually yields the unwrapped type `int` (3), which results in the exception shown in (4). Essentially, we currently describe the schema in a way that is incompatible with how an `AnyVal` class operates: "The type at compile time is Wrapper, but at runtime, the representation is an Int". ([doc](https://docs.scala-lang.org/overviews/core/value-classes.html))

```
private InternalRow If_1(InternalRow i) {
  boolean isNull_42 = i.isNullAt(0);
  // 1) The root-level case class we care about
  org.apache.spark.sql.catalyst.encoders.ComplexValueClassContainer value_46 = isNull_42 ?
    null : ((org.apache.spark.sql.catalyst.encoders.ComplexValueClassContainer) i.get(0, null));
  if (isNull_42) {
    throw new NullPointerException(((java.lang.String) references[5] /* errMsg */ ));
  }
  boolean isNull_39 = true;
  // 2) We declare its member as the case class extending `AnyVal`
  org.apache.spark.sql.catalyst.encoders.IntWrapper value_43 = null;
  if (!false) {
    isNull_39 = false;
    if (!isNull_39) {
      // 3) ERROR: `c()`, however, is compiled with return type `int`, hence the error below
      value_43 = value_46.c();
    }
  }
```

4)

```
java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 159, Column 1: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 159, Column 1: Assignment conversion not possible from type "int" to type "org.apache.spark.sql.catalyst.encoders.IntWrapper"
```

To your specific clarification "because it doesn't work at all now": it _does_ work in one case, namely a value class used in a parameterized class like `Seq[AnyVal]`. This is because there is no unwrapping in that case, and the wrapper remains as-is.
From the same Scala doc ([ref](https://docs.scala-lang.org/overviews/core/value-classes.html)), `Wrapper` "must be instantiated... when a value class is used as a type argument". This implies that `scala.Tuple[Wrapper, ...]`, `Seq[Wrapper]`, `Map[String, Wrapper]`, and `Option[Wrapper]` will still contain `Wrapper` as-is at runtime instead of `Int`.

This fix will also resolve the schema issue originally described in [SPARK-20384](https://issues.apache.org/jira/browse/SPARK-20384); the reporter will be able to access the value class in an unwrapped fashion.
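To make the two cases concrete, here is a minimal sketch in plain Scala (no Spark involved) that inspects the compiled classes via Java reflection. The names `ValueClassErasureDemo` and `Container` are hypothetical, and `IntWrapper` here is a stand-in definition, not the test class from this PR. It only illustrates the erasure behavior described in the Scala doc; it does not reproduce the encoder code path.

```scala
object ValueClassErasureDemo {
  // Stand-in value class: compile-time type IntWrapper, runtime representation Int.
  final case class IntWrapper(i: Int) extends AnyVal

  // Hypothetical container with a value-class field, analogous to the nested case.
  final case class Container(c: IntWrapper)

  // JVM return type of the field accessor `c()`: the underlying primitive,
  // not the wrapper, because the value class is unwrapped when used as a field.
  def accessorReturnType: Class[_] =
    classOf[Container].getMethod("c").getReturnType

  // Runtime class of an element stored in Seq[IntWrapper]: the wrapper is
  // instantiated because the value class is used as a type argument.
  def seqElementClass: Class[_] = {
    val wrapped: Seq[IntWrapper] = Seq(IntWrapper(1))
    val asAny: Seq[Any] = wrapped // Seq is covariant; elements are inspected as stored
    asAny.head.getClass
  }

  def main(args: Array[String]): Unit = {
    println(s"Container.c() return type: $accessorReturnType")
    println(s"Seq[IntWrapper] element class: $seqElementClass")
  }
}
```

Running `ValueClassErasureDemo.main` should report the primitive `int` for the accessor and the wrapper class for the `Seq` element, which matches why the generated code above fails for the nested field but works for `Seq[AnyVal]`.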
