Thanks Fokko,
I will definitely take a look at this.
Cheers, Andrew
From: "Driesprong, Fokko"
Date: Friday, August 24, 2018 at 2:39 AM
To: "reubensaw...@hotmail.com"
Cc: "dev@spark.apache.org"
Subject: Re: Spark data quality bug when reading parquet files f
Hi Andrew,
This blog post gives an idea of how the schema is resolved:
https://blog.godatadriven.com/multiformat-spark-partition
There is some optimisation going on when reading Parquet using Spark. Hope this helps.
Cheers, Fokko
On Wed, Aug 22, 2018 at 23:59, t4 wrote:
> https://issues.apache.org/jira/browse/SPARK-23576 ?
--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
Hello Friends,
I’ve encountered a bug where Spark silently corrupts data when reading from a
Parquet Hive table whose table schema does not match the file schema. I’d
like to take a shot at adding some extra validations to the code to handle this
corner case and I was wondering if anyone