wayneguow commented on pull request #34860:
URL: https://github.com/apache/spark/pull/34860#issuecomment-995425112
One possible way to avoid a breaking change is to add another boolean option
to CSVOptions, such as `parseEmptyValueAsEmpty`, which defaults to false.
With this option, we can change the `nullSafeDatum` method in UnivocityParser
as follows:
```
private def nullSafeDatum(
    datum: String,
    name: String,
    nullable: Boolean,
    options: CSVOptions)(converter: ValueConverter): Any = {
  if (datum == options.nullValue || datum == null) {
    if (!nullable) {
      throw new RuntimeException(s"null value found but field $name is not nullable.")
    }
    null
  } else if (options.parseEmptyValueAsEmpty && datum == options.emptyValueInRead) {
    // Proposed new branch: map the configured emptyValue back to "".
    converter.apply("")
  } else {
    converter.apply(datum)
  }
}
```
With this option, the default behavior stays the same as before, but users
like me can opt in to having emptyValue strings parsed as "" when reading CSV
files.
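
To make the intended behavior concrete, here is a minimal standalone sketch of
the same branching that runs without Spark. `SketchOptions` is a hypothetical
stand-in for the three CSVOptions fields involved, and `ValueConverter` is
assumed to be a plain `String => Any`; the sample values "NULL" and "EMPTY"
are made up for illustration.

```scala
// Hypothetical stand-in for the CSVOptions fields used by the proposal.
final case class SketchOptions(
    nullValue: String,
    emptyValueInRead: String,
    parseEmptyValueAsEmpty: Boolean = false)

object NullSafeDatumSketch {
  // Assumption: a converter is just a function from the raw string to a value.
  type ValueConverter = String => Any

  def nullSafeDatum(
      datum: String,
      name: String,
      nullable: Boolean,
      options: SketchOptions)(converter: ValueConverter): Any = {
    if (datum == options.nullValue || datum == null) {
      // The null check comes first, so it wins if nullValue equals emptyValueInRead.
      if (!nullable) {
        throw new RuntimeException(s"null value found but field $name is not nullable.")
      }
      null
    } else if (options.parseEmptyValueAsEmpty && datum == options.emptyValueInRead) {
      // Opt-in branch: the configured emptyValue is converted as "".
      converter("")
    } else {
      // Default branch: the datum is converted unchanged.
      converter(datum)
    }
  }
}
```

With `SketchOptions("NULL", "EMPTY", parseEmptyValueAsEmpty = true)`, the
datum "EMPTY" is converted as "", while with the flag left at false it is
passed through unchanged. Because the null check runs first, the new branch
only has an effect when emptyValue differs from nullValue.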