pralabhkumar commented on code in PR #37009:
URL: https://github.com/apache/spark/pull/37009#discussion_r926888668
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala:
##########
@@ -317,7 +317,15 @@ class UnivocityParser(
if (skipRow) {
row.setNullAt(i)
} else {
- row(i) = valueConverters(i).apply(getToken(tokens, i))
+ // This is required to not set value as null ,
+ // 1. If the missing value at the end of line.
+ // 2. If the missing value at the beginning of line.
+ if (!options.naFilter && (i>=tokens.length ||
+ (i==0 && getToken(tokens, i).length == 0))) {
+ row(i) = valueConverters(i).apply("")
Review Comment:
This branch is only taken when the missing value is at the beginning or the
end of the line. In that case the current code goes into the exception block
and updates the row with default values (which replaces the missing values
with null). When options.naFilter is false, we do not want to replace them
with null; they should stay as missing (empty) values.
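
To make the intent concrete, here is a minimal, self-contained sketch of the behavior described above. It is not the actual UnivocityParser code: `NaFilterSketch`, `convert`, and `convertRow` are hypothetical names, and the converter is a simplified stand-in that assumes an empty token becomes null only when naFilter is enabled.

```scala
// Hypothetical, simplified stand-in for the conversion loop.
// None of these names exist in Spark; they only illustrate the intent.
object NaFilterSketch {

  // Assumed converter behavior: with naFilter enabled, an empty or
  // null token is turned into null; otherwise the token is kept as-is.
  def convert(token: String, naFilter: Boolean): Any =
    if (naFilter && (token == null || token.isEmpty)) null else token

  def convertRow(tokens: Array[String], numFields: Int, naFilter: Boolean): Array[Any] = {
    val row = new Array[Any](numFields) // every slot starts out as null
    var i = 0
    while (i < numFields) {
      // Case 1: the line has fewer tokens than the schema has fields.
      val missingAtEnd = i >= tokens.length
      // Case 2: the first token is present but empty.
      val missingAtStart =
        i == 0 && tokens.nonEmpty && tokens(0) != null && tokens(0).isEmpty

      row(i) =
        if (!naFilter && (missingAtEnd || missingAtStart)) "" // keep it as a missing value
        else if (missingAtEnd) null                           // default: fill with null
        else convert(tokens(i), naFilter)
      i += 1
    }
    row
  }
}
```

Under these assumptions, a line with tokens `Array("", "b")` and three expected fields keeps empty strings in the first and last columns when naFilter is false, and gets null in both when naFilter is true.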
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]