Re: Schema From parquet file

2015-09-01 Thread Cheng Lian

What exactly do you mean by "get schema from a parquet file"?

- If you are trying to inspect Parquet files, parquet-tools can be 
pretty neat: https://github.com/Parquet/parquet-mr/issues/321
- If you are trying to get the Parquet schema as a Parquet MessageType, 
you can use the readFooterX() and readAllFootersX() utility methods in 
ParquetFileReader
- If you are trying to get a Spark SQL StructType schema out of a Parquet 
file, then the most convenient way is to load it as a DataFrame. 
However, "loading" it as a DataFrame doesn't mean we scan the whole 
file. Instead, we only do the minimum metadata discovery work, such as 
schema discovery and schema merging.
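
To illustrate why reading the schema does not require scanning the data: 
a Parquet file keeps its metadata, including the schema, in a footer at 
the end of the file, followed by a 4-byte little-endian footer length and 
the magic bytes "PAR1". Below is a minimal sketch, using only the Python 
standard library, that locates that footer by reading just the last 8 
bytes. The function name locate_footer and the synthetic byte layout in 
the demo are made up for illustration; this is not how Spark or 
parquet-mr implement it, and it does not decode the footer.

```python
import io
import struct

MAGIC = b"PAR1"  # magic bytes at both ends of an unencrypted Parquet file


def locate_footer(f):
    """Return (offset, length) of the serialized footer metadata.

    `f` is any seekable binary file object. Only the last 8 bytes
    of the file are read; the row groups are never touched.
    """
    f.seek(0, io.SEEK_END)
    size = f.tell()
    # Smallest possible layout: header magic + footer length + tail magic.
    if size < 12:
        raise ValueError("too small to be a Parquet file")
    f.seek(size - 8)
    tail = f.read(8)
    if tail[4:] != MAGIC:
        raise ValueError("missing trailing PAR1 magic")
    (footer_len,) = struct.unpack("<I", tail[:4])
    offset = size - 8 - footer_len
    if offset < 4:
        raise ValueError("footer length exceeds file size")
    return offset, footer_len


# Demo on a synthetic byte layout (an assumption for illustration,
# not a real Parquet file): magic, fake data, fake footer, length, magic.
fake_metadata = b"\x00" * 10
blob = (
    MAGIC
    + b"row-group-bytes"
    + fake_metadata
    + struct.pack("<I", len(fake_metadata))
    + MAGIC
)
offset, length = locate_footer(io.BytesIO(blob))
```

The bytes at that offset are Thrift-encoded file metadata, so turning 
them into an actual schema still needs a Parquet library such as 
ParquetFileReader or parquet-tools mentioned above.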


Cheng

On 9/1/15 7:07 PM, Hafiz Mujadid wrote:

Hi all!

Is there any way to get the schema from a Parquet file without loading it 
into a DataFrame?

Thanks



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Schema-From-parquet-file-tp24535.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org








Schema From parquet file

2015-09-01 Thread Hafiz Mujadid
Hi all!

Is there any way to get the schema from a Parquet file without loading it 
into a DataFrame?

Thanks


