Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22676#discussion_r223741059

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala ---
    @@ -330,7 +333,10 @@ private[csv] object UnivocityParser {
       def parseIterator(
           lines: Iterator[String],
           parser: UnivocityParser,
    +      headerChecker: CSVHeaderChecker,
           schema: StructType): Iterator[InternalRow] = {
    +    headerChecker.checkHeaderColumnNames(lines, parser.tokenizer)
    --- End diff --

    The same question here. I would prefer to consume the input iterator lazily. This is one of the advantages of iterators: unlike collections, for example, an iterator performs an action only when you explicitly call `hasNext` or `next`.
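
To illustrate what lazy consumption could look like, here is a minimal sketch. It is not Spark's code: `LazyHeaderCheck` and `withLazyCheck` are hypothetical names, and the `check` function only stands in for a call such as `headerChecker.checkHeaderColumnNames`. The idea is that the check runs on the first `hasNext` or `next` call rather than when the iterator is wrapped.

```scala
object LazyHeaderCheck {
  // Hypothetical helper (not part of Spark): wraps `lines` so that `check`
  // runs once, on the first call to `hasNext` or `next`, not at wrap time.
  def withLazyCheck[A](lines: Iterator[A])(check: Iterator[A] => Unit): Iterator[A] =
    new Iterator[A] {
      // A lazy val is evaluated on first access, which triggers the check.
      private lazy val checked: Iterator[A] = { check(lines); lines }
      override def hasNext: Boolean = checked.hasNext
      override def next(): A = checked.next()
    }

  def main(args: Array[String]): Unit = {
    var pulled = 0 // counts how many lines were read from the source
    val source = Iterator("a,b", "1,2", "3,4").map { s => pulled += 1; s }

    // Stand-in for a header check: consume the first line if present.
    val rows = withLazyCheck(source) { it => if (it.hasNext) { it.next(); () } }

    println(s"after wrapping: pulled = $pulled")
    println(s"rows: ${rows.toList}, pulled = $pulled")
  }
}
```

With this shape, nothing is read from `lines` until a caller actually asks for a row, which is the behavior the comment above asks for. Whether a wrapper like this fits `parseIterator` depends on how the returned iterator is built there.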