maropu commented on a change in pull request #33212:
URL: https://github.com/apache/spark/pull/33212#discussion_r671026357
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala
##########
@@ -418,6 +424,22 @@ class JacksonParser(
}
}
+ // When the input schema is setting to `nullable = false`, make sure the
field is not null.
+ // As PERMISSIVE mode only works with nullable fields, we can skip this
not nullable check when
+ // the mode is PERMISSIVE. (see
FailureSafeParser.checkNullabilityForPermissiveMode)
+ val checkNotNullable =
+ badRecordException.isEmpty && !skipRow && options.parseMode !=
PermissiveMode
+ if (checkNotNullable) {
+ var index = 0
+ while (index < schema.length) {
+ if (!schema(index).nullable && row.isNullAt(index)) {
+ throw new IllegalSchemaArgumentException(
+ s"field ${schema(index).name} is not nullable but it's missing in
one record.")
+ }
+ index += 1
+ }
+ }
Review comment:
Could we extract this part as a private method like this?
```
private lazy val checkNotNullableInRow = if (options.parseMode !=
PermissiveMode) {
(row: GenericInternalRow, schema: StructType, skipRow: Boolean,
runtimeExceptionOption: Option[Throwable]) => {
val checkNotNullable =
runtimeExceptionOption.isEmpty && !skipRow && options.parseMode !=
PermissiveMode
if (checkNotNullable) {
...
}
}
}
} else {
(_: GenericInternalRow, _: StructType, _: Boolean, _: Option[Throwable])
=> {}
}
```
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala
##########
@@ -405,10 +405,16 @@ class JacksonParser(
schema.getFieldIndex(parser.getCurrentName) match {
case Some(index) =>
try {
- row.update(index, fieldConverters(index).apply(parser))
+ val fieldValue = fieldConverters(index).apply(parser)
+ if (!schema(index).nullable && fieldValue == null) {
Review comment:
We still need this check for the permissive mode?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]