Hi Jorn, I am using Apache Spark 2.3.1.
To create the Parquet file I used Apache Parquet (parquet-mr) 1.10. This does not match the version of Parquet used in Apache Spark 2.3.1; if you think that could be the problem, I can try Apache Parquet 1.8.3 instead.

I created a Parquet file using Apache Spark SQL types, but I cannot make the resulting schema match the schema described in the paper. I use the Spark SQL array type for repeated values. For example, where the paper says

    repeated int64 Backward;

I use an array type:

    StructField("Backward", ArrayType(IntegerType(), containsNull=False), nullable=False)

The resulting schema, as reported by parquet-tools, is:

    optional group backward (LIST) {
      repeated group list {
        required int32 element;
      }
    }

Thanks,
Lubomir Chorbadjiev