[GitHub] spark pull request #22676: [SPARK-25684][SQL] Organize header related codes ...

MaxGekk Tue, 09 Oct 2018 08:07:15 -0700

Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22676#discussion_r223741059
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala ---
    @@ -330,7 +333,10 @@ private[csv] object UnivocityParser {
       def parseIterator(
           lines: Iterator[String],
           parser: UnivocityParser,
    +      headerChecker: CSVHeaderChecker,
           schema: StructType): Iterator[InternalRow] = {
    +    headerChecker.checkHeaderColumnNames(lines, parser.tokenizer)
    --- End diff --
    
    The same question here. I would prefer to consume the input iterator 
lazily. That is one of the advantages of iterators: unlike collections, for 
example, an iterator performs work only when you explicitly call `hasNext` or 
`next` on it.
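
To illustrate the lazy approach suggested here, below is a minimal sketch, not the code from this pull request. The names `LazyHeaderCheck`, `withLazyHeaderCheck` and `checkHeader` are hypothetical. It also assumes that the first line of the input is the header and that the check should consume it. The check runs only when the caller first pulls from the returned iterator.

```scala
object LazyHeaderCheck {
  // Hypothetical helper (not a Spark API): wraps `lines` so that the header
  // check is deferred until the first call to `hasNext` or `next`.
  def withLazyHeaderCheck(lines: Iterator[String])(
      checkHeader: String => Unit): Iterator[String] =
    new Iterator[String] {
      private var checked = false

      // Runs the check at most once, consuming the first line as the header
      // (an assumption of this sketch).
      private def ensureChecked(): Unit =
        if (!checked) {
          checked = true
          if (lines.hasNext) checkHeader(lines.next())
        }

      override def hasNext: Boolean = { ensureChecked(); lines.hasNext }
      override def next(): String = { ensureChecked(); lines.next() }
    }
}
```

With a wrapper like this, merely constructing the result does not touch the input. An eager call placed at the top of a method such as `parseIterator` would read from `lines` before any row is requested.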



