mickjermsurawong-stripe commented on pull request #33205:
URL: https://github.com/apache/spark/pull/33205#issuecomment-894881771


   That's correct, @srowen. A nested `AnyVal` class does not currently work. 
   For a value class in a nested schema 1), the schema we describe uses the 
`AnyVal` class 2), but the accessor for that nested value actually returns the 
unwrapped type `int` 3), which results in this exception 4). Essentially, we 
currently describe the schema in a way that is incompatible with how an `AnyVal` 
class operates: "The type at compile time is Wrapper, but at runtime, the 
representation is an Int" 
([doc](https://docs.scala-lang.org/overviews/core/value-classes.html)).
   ```
       private InternalRow If_1(InternalRow i) {
           boolean isNull_42 = i.isNullAt(0);
   
   ########################## 1) The root-level case class we care about ##########################
   
           org.apache.spark.sql.catalyst.encoders.ComplexValueClassContainer value_46 = isNull_42 ?
               null : ((org.apache.spark.sql.catalyst.encoders.ComplexValueClassContainer) i.get(0, null));
           if (isNull_42) {
               throw new NullPointerException(((java.lang.String) references[5] /* errMsg */ ));
           }
           }
           boolean isNull_39 = true;
   
   ########################## 2) We declare its member as the wrapper type, a case class extending `AnyVal`
   
           org.apache.spark.sql.catalyst.encoders.IntWrapper value_43 = null;
           if (!false) {
   
               isNull_39 = false;
               if (!isNull_39) {
   
   ########################## 3) ******** ERROR: the compiled `c()`, however, returns `int`, hence the compile error
   
                   value_43 = value_46.c();
               }
           }
   ```
   4) 
   ```
   java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: 
   File 'generated.java', Line 159, Column 1: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 159, Column 1: Assignment conversion not possible from type "int" to type "org.apache.spark.sql.catalyst.encoders.IntWrapper"
   ```
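
The same mismatch can be seen outside of Spark with plain reflection. This is only a minimal sketch: `IntWrapper` and `ComplexValueClassContainer` below are stand-ins that I am assuming resemble the test classes named in the generated code, and the object name `ValueClassErasureDemo` is made up for illustration.

```scala
object ValueClassErasureDemo {
  // Assumed shapes of the classes from the generated code; the real test
  // classes in Spark may have more fields.
  case class IntWrapper(i: Int) extends AnyVal
  case class ComplexValueClassContainer(c: IntWrapper)

  // Runtime (bytecode-level) return type of the accessor `c`.
  // The Scala source type is IntWrapper, but the compiled method returns
  // the underlying primitive.
  def accessorReturnType: Class[_] =
    classOf[ComplexValueClassContainer].getMethod("c").getReturnType

  def main(args: Array[String]): Unit = {
    println(accessorReturnType) // prints "int"
  }
}
```

Because the accessor returns `int`, generated Java that assigns `value_46.c()` to an `IntWrapper` variable cannot compile, which matches the "Assignment conversion not possible" error above.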
   
   To your specific clarification "because it doesn't work at all now": it 
_does_ work in one case, namely a value class used inside a parameterized class 
such as `Seq[AnyVal]`. This is because no unwrapping happens there, and the 
wrapper remains as-is. From the same Scala doc 
([ref](https://docs.scala-lang.org/overviews/core/value-classes.html)), `Wrapper` 
"must be instantiated... when a value class is used as a type argument". This 
implies that `scala.Tuple[Wrapper, ...], Seq[Wrapper], Map[String, Wrapper], 
Option[Wrapper]` will still contain `Wrapper` as-is at runtime instead of 
`Int`.
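
As a rough illustration of the type-argument case (again a sketch with a stand-in `IntWrapper` and a made-up object name, not code from this PR), the elements stored in a `Seq` or `Option` are real wrapper instances at runtime:

```scala
object ValueClassBoxingDemo {
  // Stand-in value class, assumed to resemble the one in the generated code.
  case class IntWrapper(i: Int) extends AnyVal

  // Look at what the collection actually stores by viewing its elements
  // as plain objects; no unwrapping to Int happens on this path.
  def storedInSeq(xs: Seq[IntWrapper]): Class[_] =
    xs.asInstanceOf[Seq[AnyRef]].head.getClass

  def storedInOption(x: Option[IntWrapper]): Class[_] =
    x.asInstanceOf[Option[AnyRef]].get.getClass

  def main(args: Array[String]): Unit = {
    val w = IntWrapper(1)
    // Both checks print true: the wrapper is instantiated when it is
    // used as a type argument.
    println(storedInSeq(Seq(w)) == classOf[IntWrapper])
    println(storedInOption(Some(w)) == classOf[IntWrapper])
  }
}
```

This is why a schema that describes the element type as the wrapper class happens to line up with the runtime representation for `Seq[Wrapper]`, while it does not for a directly nested field.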
   
   This fix will also resolve the schema issue originally described in 
[SPARK-20384](https://issues.apache.org/jira/browse/SPARK-20384); the 
reporter will be able to access the value class in an unwrapped fashion. 

