wayneguow commented on pull request #34860:
URL: https://github.com/apache/spark/pull/34860#issuecomment-995425112


   One possible way to avoid a breaking change is to add another boolean 
option to CSVOptions, such as `parseEmptyValueAsEmpty`, which defaults to 
false. With this option, we can change the `nullSafeDatum` method in 
UnivocityParser as follows:
   ```
   private def nullSafeDatum(
        datum: String,
        name: String,
        nullable: Boolean,
        options: CSVOptions)(converter: ValueConverter): Any = {
     if (datum == options.nullValue || datum == null) {
       if (!nullable) {
         throw new RuntimeException(s"null value found but field $name is not nullable.")
       }
       null
     } else if (options.parseEmptyValueAsEmpty && datum == options.emptyValueInRead) {
       converter.apply("")
     } else {
       converter.apply(datum)
     }
   }
   ```
   With this option, the default behavior stays the same as before, but users 
like me can parse emptyValue strings as "" when reading CSV files.
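
   To make the three branches concrete, here is a minimal standalone sketch 
of the proposed logic. It does not depend on Spark: `SketchCsvOptions` is a 
hypothetical stand-in for CSVOptions that models only the fields used above, 
`ValueConverter` is assumed to be a plain `String => Any`, and the "NULL" and 
"EMPTY" markers in the usage below are made-up values chosen so the branches 
are easy to tell apart.

```scala
// Hypothetical, simplified stand-in for Spark's CSVOptions (assumption):
// only the three fields the proposal touches are modeled.
final case class SketchCsvOptions(
    nullValue: String,
    emptyValueInRead: String,
    parseEmptyValueAsEmpty: Boolean = false) // proposed option, off by default

object NullSafeDatumSketch {
  // Assumed shape of the converter: takes the raw string, returns a value.
  type ValueConverter = String => Any

  def nullSafeDatum(
      datum: String,
      name: String,
      nullable: Boolean,
      options: SketchCsvOptions)(converter: ValueConverter): Any = {
    if (datum == options.nullValue || datum == null) {
      // Null marker (or a missing value): only allowed for nullable fields.
      if (!nullable) {
        throw new RuntimeException(s"null value found but field $name is not nullable.")
      }
      null
    } else if (options.parseEmptyValueAsEmpty && datum == options.emptyValueInRead) {
      // New branch: map the configured empty-value marker back to "".
      converter("")
    } else {
      // Unchanged path: convert the raw string as-is.
      converter(datum)
    }
  }
}
```

   With the option off, the empty-value marker is passed to the converter 
unchanged; with it on, the converter receives "" instead. For example, given 
`val asIs: String => Any = s => s` and 
`val off = SketchCsvOptions(nullValue = "NULL", emptyValueInRead = "EMPTY")`, 
calling `nullSafeDatum("EMPTY", "c", true, off)(asIs)` returns "EMPTY", while 
the same call with `off.copy(parseEmptyValueAsEmpty = true)` returns "".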


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


