Yes, we recently improved ParquetRelation2 quite a bit. Spark SQL now uses its own Parquet support to read partitioned Parquet tables declared in the Hive metastore; writing to partitioned tables is the only case not covered yet. These improvements will be included in Spark 1.3.0.

Just created SPARK-5948 to track writing to partitioned Parquet tables.
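
For concreteness, here is a minimal sketch of the 1.3.0 behavior (the table names logs/staging_logs and the partition column dt are made up for illustration):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("parquet-example"))
    // A HiveContext is needed to see tables declared in the Hive metastore.
    val hiveContext = new HiveContext(sc)

    // Read path: with spark.sql.hive.convertMetastoreParquet left at its
    // default of "true", this scan of a partitioned metastore Parquet table
    // goes through Spark SQL's native Parquet support (ParquetRelation2),
    // including partition pruning on `dt`.
    hiveContext.sql("SELECT count(*) FROM logs WHERE dt = '2015-02-20'").show()

    // Write path: inserting into a *partitioned* table still falls back to
    // Hive's SerDe path until SPARK-5948 is resolved; writes to
    // unpartitioned Parquet tables already use the native path.
    hiveContext.sql(
      "INSERT INTO TABLE logs PARTITION (dt = '2015-02-21') " +
      "SELECT message FROM staging_logs")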

Cheng

On 2/20/15 10:58 PM, The Watcher wrote:

    1. In Spark 1.3.0, timestamp support was added. Also, Spark SQL now uses
    its own Parquet support to handle both the read path and the write path
    when dealing with Parquet tables declared in the Hive metastore, as long
    as you're not writing to a partitioned table. So yes, you can.

Ah, I had missed the part about being partitioned or not. Is this related
to the work being done on ParquetRelation2?

We will indeed write to a partitioned table: in that case, does neither the
read path nor the write path go through Spark SQL's Parquet support? Is
there a JIRA/PR I can monitor to see when this will change?

Thanks


