I would first try the same Parquet version that Spark uses. I don't have the Parquet changelog in my head (you can find it on the Internet), but the version mismatch could be the cause of your issues.
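In case it helps, here is a rough sketch for checking which Parquet jars a Spark installation ships with, so the writer can be matched to them. It assumes that SPARK_HOME points at a Spark distribution with a jars/ directory and that the jar files follow the usual "artifact-version.jar" naming. The helper names are made up for illustration and are not part of Spark.

```python
import os
import re

# Matches jar names such as "parquet-hadoop-1.8.3.jar" (assumed naming scheme).
_PARQUET_JAR = re.compile(r"^(parquet-[a-z-]+?)-(\d[\w.]*)\.jar$")


def parquet_versions(jar_names):
    """Map each parquet artifact name to the version in its jar file name."""
    versions = {}
    for name in jar_names:
        match = _PARQUET_JAR.match(name)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


def spark_parquet_versions(spark_home=None):
    """List the parquet jars under <spark_home>/jars (hypothetical helper)."""
    spark_home = spark_home or os.environ.get("SPARK_HOME", "")
    jars_dir = os.path.join(spark_home, "jars")
    if not os.path.isdir(jars_dir):
        # No jars/ directory found, so there is nothing to report.
        return {}
    return parquet_versions(os.listdir(jars_dir))


if __name__ == "__main__":
    for artifact, version in sorted(spark_parquet_versions().items()):
        print(artifact, version)
```

Several parquet artifacts may show up with different versions, so the one to compare against parquet-mr is the version of the core artifacts such as parquet-hadoop or parquet-column.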
> On 31.10.2018 at 12:26, lchorbadjiev <lubomir.chorbadj...@gmail.com> wrote:
>
> Hi Jorn,
>
> I am using Apache Spark 2.3.1.
>
> For creating the parquet file I have used Apache Parquet (parquet-mr) 1.10.
> This does not match the version of parquet used in Apache Spark 2.3.1, and if
> you think that this could be the problem I could try to use Apache Parquet
> version 1.8.3.
>
> I created a parquet file using Apache Spark SQL types, but cannot make the
> resulting schema match the schema described in the paper.
>
> What I do is use the Spark SQL array type for repeated values. For example,
> where the paper says
>
>     repeated int64 Backward;
>
> I use the array type:
>
>     StructField("Backward", ArrayType(IntegerType(), containsNull=False),
>                 nullable=False)
>
> The resulting schema, reported by parquet-tools, is:
>
>     optional group backward (LIST) {
>       repeated group list {
>         required int32 element;
>       }
>     }
>
> Thanks,
> Lubomir Chorbadjiev
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org