how to read it using parquet-tools. When I ran

    hadoop parquet.tools.Main meta parquetfilename

I didn't get any info about min and max values. How can I see the Parquet version of my file? Is min/max specific to some Parquet version, or has it been available since the beginning?

On Mon, Dec 7, 2015 at 6:51 PM, Singh, Abhijeet <absi...@informatica.com> wrote:

> Yes, Parquet has min/max.
>
> *From:* Cheng Lian [mailto:l...@databricks.com]
> *Sent:* Monday, December 07, 2015 11:21 AM
> *To:* Ted Yu
> *Cc:* Shushant Arora; user@spark.apache.org
> *Subject:* Re: parquet file doubts
>
> Oh sorry... At first I meant to cc the spark-user list, since Shushant and I
> had discussed some Spark-related issues before. Then I realized that this is
> a pure Parquet issue, but forgot to change the cc list. Thanks for pointing
> this out! Please ignore this thread.
>
> Cheng
>
> On 12/7/15 12:43 PM, Ted Yu wrote:
>
> Cheng:
>
> I only see user@spark in the CC.
>
> FYI
>
> On Sun, Dec 6, 2015 at 8:01 PM, Cheng Lian <l...@databricks.com> wrote:
>
> cc parquet-dev list (it would be nice to always do so for these general
> questions.)
>
> Cheng
>
> On 12/6/15 3:10 PM, Shushant Arora wrote:
>
> Hi
>
> I have a few doubts about the Parquet file format.
>
> 1. Does Parquet keep min/max statistics like ORC does? How can I see the
> Parquet version (whether it's 1.1, 1.2, or 1.3) of a Parquet file generated
> using Hive, custom MR, or AvroParquetOutputFormat?
>
> Yes, Parquet also keeps row group statistics. You may check the Parquet
> file using the parquet-meta CLI tool in parquet-tools (see
> https://github.com/Parquet/parquet-mr/issues/321 for details), then look
> for the "creator" field of the file. For programmatic access, check
> o.a.p.hadoop.metadata.FileMetaData.createdBy.
>
> 2. How do I sort Parquet records while generating a Parquet file using
> AvroParquetOutputFormat?
>
> AvroParquetOutputFormat is not a format. It's just responsible for
> converting Avro records to Parquet records. How are you using
> AvroParquetOutputFormat? Any example snippets?
>
> Thanks