GitHub user MaxGekk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20894#discussion_r188966754

    --- Diff: python/pyspark/sql/tests.py ---
    @@ -3040,6 +3040,24 @@ def test_csv_sampling_ratio(self):
                 .csv(rdd, samplingRatio=0.5).schema
             self.assertEquals(schema, StructType([StructField("_c0", IntegerType(), True)]))

    +    def test_checking_csv_header(self):
    +        tmpPath = tempfile.mkdtemp()
    +        shutil.rmtree(tmpPath)
    +        try:
    +            self.spark.createDataFrame([[1, 1000], [2000, 2]])\
    +                .toDF('f1', 'f2').write.option("header", "true").csv(tmpPath)
    +            schema = StructType([
    +                StructField('f2', IntegerType(), nullable=True),
    +                StructField('f1', IntegerType(), nullable=True)])
    +            df = self.spark.read.option('header', 'true').schema(schema)\
    +                .csv(tmpPath, enforceSchema=False)
    +            self.assertRaisesRegexp(
    +                Exception,
    +                "CSV file header does not contain the expected fields",
    --- End diff --

    eh, I have already changed the error message as @hvanhovell suggested. What about:
    ```
    CSV header is not conform to the schema
    ```
    So the error message will look like this:
    ```
    java.lang.IllegalArgumentException: CSV header is not conform to the schema
    Header: depth, temperature
    Schema: temperature, depth
    CSV file: marina.csv
    ```
    Is that OK with you?
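
    For illustration only, here is a minimal Python sketch of how a message in the proposed format could be assembled. The helper name `check_csv_header` and the use of `ValueError` are assumptions made for this example. It is not Spark's implementation, which is in Scala and raises `java.lang.IllegalArgumentException`.

    ```python
    def check_csv_header(header, schema_fields, file_name):
        """Hypothetical helper: compare header names with schema field names.

        The comparison is positional, so the same names in a different
        order count as a mismatch.
        """
        if list(header) != list(schema_fields):
            # Message layout follows the format proposed in the comment above.
            raise ValueError(
                "CSV header is not conform to the schema\n"
                "Header: {}\n"
                "Schema: {}\n"
                "CSV file: {}".format(
                    ", ".join(header), ", ".join(schema_fields), file_name))


    if __name__ == "__main__":
        try:
            check_csv_header(["depth", "temperature"],
                             ["temperature", "depth"],
                             "marina.csv")
        except ValueError as err:
            print(err)
    ```

    With a message like this, the regular expression in the test would match on the first line of the text.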