Github user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21296#discussion_r187810542

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -1322,4 +1322,31 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
         val sampled = spark.read.option("inferSchema", true).option("samplingRatio", 1.0).csv(ds)
         assert(sampled.count() == ds.count())
       }
    +
    +  test("SPARK-24244: Select a little of many columns") {
    --- End diff --

    I added the test to check that requesting only a subset of all columns works correctly, and to cover the case where the fields are selected in a different order from the data schema. Previously I was concerned that if I selected columns in a different order, like `select('f15, 'f10, 'f5)`, I would get a required schema with the fields in that same order. It seems, however, that the required schema keeps the same order as the data schema. That's why I removed https://github.com/apache/spark/pull/21296/files/a4a0a549156a15011c33c7877a35f244d75b7a4f#diff-d19881aceddcaa5c60620fdcda99b4c4L214
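
    For readers without the diff at hand, here is a rough standalone sketch of the kind of check described above. It is not the test from the pull request. The object name `SelectFewOfManyColumns`, the helper `readSelected`, the temp file, and the choice of 20 columns named `f0` to `f19` are all assumptions made for illustration. It only uses the public `DataFrameReader.csv` and `select` API, and checks that the selected columns come back in the order they were requested, regardless of how the required schema is ordered internally.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical illustration, not the CSVSuite test from the pull request.
object SelectFewOfManyColumns {

  // Writes a one-row CSV file with columns f0..f19 (assumed layout), reads it
  // back, selects three columns out of data-schema order, and returns the
  // resulting column names and row values.
  def readSelected(): (Seq[String], Seq[Seq[Any]]) = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("select-few-of-many-columns")
      .getOrCreate()
    val file = Files.createTempFile("few-of-many", ".csv")
    try {
      val numFields = 20
      val header = (0 until numFields).map(i => s"f$i").mkString(",")
      val values = (0 until numFields).mkString(",")
      Files.write(file, s"$header\n$values\n".getBytes(StandardCharsets.UTF_8))

      val df = spark.read.option("header", "true").csv(file.toString)
      // Column order here differs from the order in the file.
      val picked = df.select("f15", "f10", "f5")

      // Collect before the file is deleted, since reading is lazy.
      (picked.columns.toSeq, picked.collect().map(_.toSeq).toSeq)
    } finally {
      Files.deleteIfExists(file)
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (columns, rows) = readSelected()
    // Without inferSchema every CSV value is read as a string.
    assert(columns == Seq("f15", "f10", "f5"))
    assert(rows == Seq(Seq("15", "10", "5")))
  }
}
```

    Writing to a real file rather than passing a `Dataset[String]` is deliberate in this sketch, on the assumption that the file-based read path is the one where the required schema matters.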