shardulm94 commented on code in PR #37634:
URL: https://github.com/apache/spark/pull/37634#discussion_r954381765
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala:
##########
@@ -252,28 +264,43 @@ object GenerateUnsafeProjection extends
CodeGenerator[Seq[Expression], UnsafePro
""".stripMargin
}
+ /**
+ * Wrap `inputExpr` in a try-catch block that will catch any
[[NullPointerException]] that is
+ * thrown, instead throwing a (more helpful) error message as provided by
+ *
[[org.apache.spark.sql.errors.QueryExecutionErrors.valueCannotBeNullError]].
+ */
+ private def wrapWithNpeHandling(inputExpr: String, descPath: Seq[String]):
String =
+ s"""
+ |try {
+ | ${inputExpr.trim}
+ |} catch (NullPointerException npe) {
+ | throw
QueryExecutionErrors.valueCannotBeNullError("${descPath.mkString(".")}");
Review Comment:
A column with `"` as part of the column name will break the syntax in this
generated code. We should escape quotes here. I tried to look if other errors
escaped quotes in user controlled strings, but wasn't able to find any. e.g
https://github.com/apache/spark/blob/295e98d29b34e2b472c375608b8782c3b9189444/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala#L1165
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala:
##########
@@ -447,6 +447,13 @@ private[sql] object QueryExecutionErrors extends
QueryErrorsBase {
new RuntimeException(fieldCannotBeNullMsg(index, fieldName))
}
+ def valueCannotBeNullError(locationDesc: String): RuntimeException = {
+ new RuntimeException(s"The value at $locationDesc cannot be null, but a
NULL was found. " +
+ "This is typically caused by the presence of a NULL value when the
schema indicates the " +
+ "value should be non-null. Check that the input data matches the schema
and/or that UDFs " +
Review Comment:
Do the changes in `GenerateUnsafeProjection` also take care of the throwing
useful error messages for non-nullable UDFs returning null values? If yes, can
we also add a test case for the same?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]