Hi Jorn,

I am using Apache Spark 2.3.1.

To create the Parquet file I used Apache Parquet (parquet-mr) 1.10.
This does not match the Parquet version used in Apache Spark 2.3.1; if
you think that could be the problem, I can try Apache Parquet
version 1.8.3 instead.

I created a Parquet file using Apache Spark SQL types, but I cannot make the
resulting schema match the schema described in the paper.

What I do is use the Spark SQL array type for repeated values. For example,
where the paper says

    repeated int64 Backward;

I use the array type:

    StructField("Backward", ArrayType(IntegerType(), containsNull=False),
                nullable=False)
    
The resulting schema, reported by parquet-tools is:

    optional group backward (LIST) {
      repeated group list {
        required int32 element;
      }
    }
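
To make the comparison concrete, here is a small plain-Python sketch (no Spark
needed) that renders the three-level LIST layout shown above from the same
inputs I pass to ArrayType and StructField. The helper name
`render_list_field` and its parameters are made up for illustration; they are
not part of Spark or parquet-mr, and the sketch only mirrors the layout that
parquet-tools reports, it does not explain why Spark chooses it.

```python
def render_list_field(name, element_type, contains_null, nullable):
    """Render a three-level Parquet LIST group as schema text.

    Hypothetical helper for illustration only; not a Spark or parquet-mr API.
    """
    outer = "optional" if nullable else "required"      # repetition of the list itself
    inner = "optional" if contains_null else "required"  # repetition of each element
    return "\n".join([
        f"{outer} group {name} (LIST) {{",
        "  repeated group list {",
        f"    {inner} {element_type} element;",
        "  }",
        "}",
    ])


# The field as it appears in the written file: the outer group is reported
# as optional even though the StructField was declared with nullable=False.
schema_text = render_list_field("backward", "int32",
                                contains_null=False, nullable=True)
print(schema_text)
```

The point of the sketch is that the only `repeated` node is the intermediate
`list` group, so the array never shows up as a bare `repeated` primitive the
way it is written in the paper.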

Thanks,
Lubomir Chorbadjiev


