I would first try with the same Parquet version that Spark uses. I don’t have
the Parquet changelog in my head (you can find it on the Internet), but a
version mismatch could be the cause of your issues.
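
For reference, the shape that parquet-tools reports in the quoted message is,
as far as I can tell, the standard three-level LIST layout from the Parquet
format specification rather than the bare repeated field written in the paper.
A rough sketch in plain Python that prints both forms side by side; the two
helper functions are made up for illustration and are not part of Spark or
parquet-mr:

```python
def dremel_repeated(name, primitive):
    """Field as written in the paper: a bare repeated primitive."""
    return f"repeated {primitive} {name};"


def parquet_list(name, primitive, field_repetition="optional",
                 element_repetition="required"):
    """Three-level LIST layout described in the Parquet format specification.

    The repetition defaults mirror the parquet-tools output quoted in this
    thread; they are assumptions of this sketch, not fixed by the format.
    """
    return "\n".join([
        f"{field_repetition} group {name} (LIST) {{",
        "  repeated group list {",
        f"    {element_repetition} {primitive} element;",
        "  }",
        "}",
    ])


# Hypothetical illustration using the field from the quoted message.
paper_form = dremel_repeated("Backward", "int64")
list_form = parquet_list("backward", "int32")

print(paper_form)
print(list_form)
```

Separately, the int32 in the reported schema follows from IntegerType; Spark's
LongType is the type that is written as int64, which is what the paper uses.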

> On 31.10.2018 at 12:26, lchorbadjiev <lubomir.chorbadj...@gmail.com> wrote:
> 
> Hi Jorn,
> 
> I am using Apache Spark 2.3.1.
> 
> For creating the parquet file I have used Apache Parquet (parquet-mr) 1.10.
> This does not match the version of parquet used in Apache Spark 2.3.1 and if
> you think that this could be the problem I could try to use Apache Parquet
> version 1.8.3.
> 
> I created a parquet file using Apache Spark SQL types, but cannot make the
> resulting schema match the schema described in the paper.
> 
> What I do is use the Spark SQL array type for repeated values. For example,
> where the paper says
> 
>    repeated int64 Backward;
> 
> I use array type:
> 
>    StructField("Backward", ArrayType(IntegerType(), containsNull=False),
>                nullable=False)
> 
> The resulting schema, reported by parquet-tools is:
> 
>    optional group backward (LIST) {
>      repeated group list {
>        required int32 element;
>      }
>    }
> 
> Thanks,
> Lubomir Chorbadjiev
> 
> 
> 
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 
