Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21380#discussion_r189828536
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala
---
@@ -300,14 +302,11 @@ private[csv] object UnivocityParser {
lines
}
- val filteredLines: Iterator[String] =
- CSVUtils.filterCommentAndEmpty(linesWithoutHeader, options)
--- End diff --
@MaxGekk, passing tests don't always mean that we have tests for this. The
filter was there from the beginning, and when I tried to remove it before, the
tests broke. I expected them to break again, but they seem to pass now. So,
I'm just guessing that it's fixed.
Usually we trust it, but we should be careful when issues have been found
before. I don't think we should make this case special, and I don't see a
meaningful improvement either.
One nit, BTW: `ignoreLeadingWhiteSpaceInRead` and
`ignoreTrailingWhiteSpaceInRead` are basically for trimming whitespace in
the values, not for skipping empty lines.
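To make the distinction concrete, here is a minimal sketch in plain Scala. It is a toy model only, not Spark's or univocity's actual code: the object `TrimVsSkip` and the helper `trimValues` are hypothetical, and `filterCommentAndEmpty` below merely borrows the name of the method in the diff and assumes it drops blank lines and comment lines.

```scala
import java.util.regex.Pattern

// Toy model (assumption): illustrates line-level filtering vs. value-level
// trimming. None of this is the real Spark or univocity implementation.
object TrimVsSkip {
  // Line-level: drop lines that are blank or start with the comment marker.
  def filterCommentAndEmpty(lines: Iterator[String], comment: Char): Iterator[String] =
    lines.filter { line =>
      val t = line.trim
      t.nonEmpty && t.head != comment
    }

  // Value-level: split one line on the separator and trim each value.
  // The line itself is never dropped, even if every value ends up empty.
  def trimValues(line: String, sep: Char): Seq[String] =
    line.split(Pattern.quote(sep.toString), -1).map(_.trim).toSeq
}
```

In this sketch, `TrimVsSkip.trimValues("   ", ',')` still yields one row holding a single empty value, whereas `filterCommentAndEmpty` would have removed that line before parsing. That is the sense in which trimming alone does not skip empty lines.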