How can I read it using parquet-tools?
When I ran
hadoop parquet.tools.Main meta parquetfilename

I didn't get any information about the min and max values.

How can I see the Parquet version of my file? Are min/max statistics specific
to some Parquet version, or have they been available since the beginning?
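
For reference, the "creator" string mentioned in the quoted reply below usually
looks something like "parquet-mr version 1.6.0 (build <hash>)", and as far as I
understand it names the writer library and its version. Below is a minimal
Python sketch, assuming that layout, for pulling the version out of such a
string once it has been obtained. parse_created_by is a hypothetical helper
name for illustration, not part of parquet-tools, and the sample string is
made up.

```python
import re

# Assumed layout of the footer's "created by" string, for example:
#   "parquet-mr version 1.6.0 (build <hash>)"
# The "(build ...)" part is treated as optional.
CREATED_BY = re.compile(
    r"^(?P<app>.+?)\s+version\s+(?P<version>\S+)"
    r"(?:\s+\(build\s+(?P<build>[^)]*)\))?"
)


def parse_created_by(created_by):
    """Return (application, version, build), or None if the layout differs."""
    match = CREATED_BY.match(created_by.strip())
    if match is None:
        return None
    return match.group("app"), match.group("version"), match.group("build")


if __name__ == "__main__":
    # Illustrative input only; a real value comes from the file footer.
    print(parse_created_by("parquet-mr version 1.6.0 (build abc123)"))
```

If the string does not follow this layout, the helper returns None rather than
guessing.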


On Mon, Dec 7, 2015 at 6:51 PM, Singh, Abhijeet <absi...@informatica.com>
wrote:

> Yes, Parquet has min/max.
>
>
>
> *From:* Cheng Lian [mailto:l...@databricks.com]
> *Sent:* Monday, December 07, 2015 11:21 AM
> *To:* Ted Yu
> *Cc:* Shushant Arora; user@spark.apache.org
> *Subject:* Re: parquet file doubts
>
>
>
> Oh sorry... At first I meant to cc the spark-user list since Shushant and I
> had discussed some Spark-related issues before. Then I realized that
> this is a pure Parquet issue, but forgot to change the cc list. Thanks for
> pointing this out! Please ignore this thread.
>
> Cheng
>
> On 12/7/15 12:43 PM, Ted Yu wrote:
>
> Cheng:
>
> I only see user@spark in the CC.
>
>
>
> FYI
>
>
>
> On Sun, Dec 6, 2015 at 8:01 PM, Cheng Lian <l...@databricks.com> wrote:
>
> cc parquet-dev list (it would be nice to always do so for these general
> questions.)
>
> Cheng
>
> On 12/6/15 3:10 PM, Shushant Arora wrote:
>
> Hi
>
> I have a few doubts about the Parquet file format.
>
> 1. Does Parquet keep min/max statistics like ORC does? How can I see the
> Parquet version (whether it's 1.1, 1.2, or 1.3) of a Parquet file generated
> using Hive, a custom MR job, or AvroParquetOutputFormat?
>
> Yes, Parquet also keeps row group statistics. You may check the Parquet
> file using the parquet-meta CLI tool in parquet-tools (see
> https://github.com/Parquet/parquet-mr/issues/321 for details), then look
> for the "creator" field of the file. For programmatic access, check for
> o.a.p.hadoop.metadata.FileMetaData.createdBy.
>
>
> 2. How can I sort Parquet records while generating a Parquet file using
> AvroParquetOutputFormat?
>
> AvroParquetOutputFormat is not a format. It's just responsible for
> converting Avro records to Parquet records. How are you using
> AvroParquetOutputFormat? Any example snippets?
>
>
> Thanks
>
>
>
