xkrogen commented on code in PR #37634:
URL: https://github.com/apache/spark/pull/37634#discussion_r974499166


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala:
##########
@@ -252,28 +267,44 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
      """.stripMargin
   }
 
+  /**
+   * Wrap `inputExpr` in a try-catch block that will catch any [[NullPointerException]] that is
+   * thrown, instead throwing a (more helpful) error message as provided by
+   * [[org.apache.spark.sql.errors.QueryExecutionErrors.valueCannotBeNullError]].
+   */
+  private def wrapWithNpeHandling(inputExpr: String, descPath: Seq[String]): String =
+    s"""
+       |try {
+       |  ${inputExpr.trim}

Review Comment:
   I prefer catching the exception because it handles this issue with zero overhead when no null is encountered. Adding a null check here would essentially fall back to the logic used for a nullable schema:
   
https://github.com/apache/spark/blob/0494dc90af48ce7da0625485a4dc6917a244d580/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala#L119-L133
   From the benchmark results, we can see that there is nontrivial overhead for the null-check; for the simple case of a projection of a primitive, the overhead is almost 50%:
   
https://github.com/apache/spark/blob/2a1f9767213c321bd52e7714fa3b5bfc4973ba40/sql/catalyst/benchmarks/UnsafeProjectionBenchmark-jdk17-results.txt#L9-L10
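
   To make the two options concrete, here is a minimal Java sketch (Java because that is what the codegen template emits). It is an illustration only, not the code from the PR: the class, the method names, and the error text are hypothetical, and `IllegalStateException` merely stands in for the error produced by `valueCannotBeNullError`.

```java
// Sketch only: class, method names, and error text are hypothetical.
final class NpeHandlingSketch {

    // Hypothetical stand-in for the more helpful error raised in the PR.
    static IllegalStateException valueCannotBeNull(String path, Throwable cause) {
        return new IllegalStateException("Value at '" + path + "' cannot be null", cause);
    }

    // Option A: explicit null check, executed for every value.
    static int readWithNullCheck(Integer boxed, String path) {
        if (boxed == null) {
            throw valueCannotBeNull(path, null);
        }
        return boxed.intValue();
    }

    // Option B: no check; calling intValue() on null throws a
    // NullPointerException, which is translated into the helpful error.
    static int readWithCatch(Integer boxed, String path) {
        try {
            return boxed.intValue();
        } catch (NullPointerException e) {
            throw valueCannotBeNull(path, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithNullCheck(7, "a.b"));
        System.out.println(readWithCatch(7, "a.b"));
        try {
            readWithCatch(null, "a.b");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

   Both options report the same error for a null input; they differ only in whether a check runs on every non-null value.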
   
   You call out the situation where a null is silently replaced with a default value; this is a good point. I'm not sure how we can handle that without the additional overhead of an explicit check. The default-value replacement seems to come from [Scala's own unboxing logic](https://github.com/scala/scala/blob/986dcc160aab85298f6cab0bf8dd0345497cdc01/src/library/scala/runtime/BoxesRunTime.java#L102).
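
   The following Java sketch illustrates why a try-catch wrapper cannot see this case. The `unboxToInt` helper is modeled on the linked `BoxesRunTime` logic and is written here as an assumption about its behavior rather than a copy of it; the class and the `describe` method are hypothetical.

```java
// Sketch only: class and method names are hypothetical.
final class UnboxDefaultSketch {

    // Modeled on the linked unboxing logic (an assumption): null maps to
    // the primitive default instead of throwing.
    static int unboxToInt(Object boxed) {
        return boxed == null ? 0 : ((Integer) boxed).intValue();
    }

    // A try-catch around the unboxing never fires for null, because no
    // NullPointerException is thrown on this path.
    static String describe(Object boxed) {
        try {
            return "value=" + unboxToInt(boxed);
        } catch (NullPointerException e) {
            return "null detected";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Integer.valueOf(5)));
        System.out.println(describe(null)); // prints "value=0"
    }
}
```

   Under that assumption, detecting the null would require an explicit check before the unboxing step.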



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

