Cheng Lian created SPARK-9272:
---------------------------------

             Summary: Persist information of individual partitions when persisting partitioned data source tables to metastore
                 Key: SPARK-9272
                 URL: https://issues.apache.org/jira/browse/SPARK-9272
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 1.5.0
            Reporter: Cheng Lian

Currently, when a partitioned data source table is persisted to the Hive metastore, we only persist its partition columns. Information about individual partitions is not persisted. This forces us to run partition discovery before reading a persisted partitioned table, which hurts performance.

To fix this issue, we may persist partition information into the metastore. Specifically, the format should be compatible with Hive to ensure interoperability.

One approach to collecting partition values and partition directory paths for dynamically partitioned tables is to use accumulators to collect the expected information during the write job.
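
The accumulator idea above can be illustrated with a small, Spark-free sketch. It is not the proposed implementation: `PartitionAccumulator`, `write_task`, `run_write_job`, and `partition_dir` are hypothetical names, the tasks run sequentially in one process, and the `col=value` directory layout is assumed to follow the usual Hive convention (escaping of special characters in values is omitted). Each simulated write task records the partition values and directory path it touches, and the driver merges the per-task results into the list that would then be registered in the metastore.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Ordered (column, value) pairs identifying one partition.
PartitionSpec = Tuple[Tuple[str, str], ...]


def partition_dir(base_path: str, spec: PartitionSpec) -> str:
    """Build a Hive-style partition directory: <base>/<col1>=<val1>/<col2>=<val2>."""
    segments = [f"{column}={value}" for column, value in spec]
    return "/".join([base_path.rstrip("/")] + segments)


class PartitionAccumulator:
    """Hypothetical stand-in for a set-valued accumulator.

    Tasks only add to it; the driver merges task copies and reads the result.
    """

    def __init__(self) -> None:
        self._seen: Set[Tuple[PartitionSpec, str]] = set()

    def add(self, spec: PartitionSpec, path: str) -> None:
        self._seen.add((spec, path))

    def merge(self, other: "PartitionAccumulator") -> None:
        self._seen |= other._seen

    def value(self) -> List[Tuple[PartitionSpec, str]]:
        return sorted(self._seen)


def write_task(rows: Iterable[Dict[str, object]],
               partition_columns: List[str],
               base_path: str) -> PartitionAccumulator:
    """Simulate one write task: note every partition its rows fall into."""
    acc = PartitionAccumulator()
    for row in rows:
        spec = tuple((c, str(row[c])) for c in partition_columns)
        acc.add(spec, partition_dir(base_path, spec))
        # A real write job would also write the row under that directory.
    return acc


def run_write_job(tasks: Iterable[Iterable[Dict[str, object]]],
                  partition_columns: List[str],
                  base_path: str) -> List[Tuple[PartitionSpec, str]]:
    """Driver side: merge per-task accumulators into one partition list."""
    driver_acc = PartitionAccumulator()
    for rows in tasks:
        driver_acc.merge(write_task(rows, partition_columns, base_path))
    return driver_acc.value()


# Example input: two tasks whose rows overlap on one partition.
example_tasks = [
    [{"year": 2015, "month": 7, "v": 1}, {"year": 2015, "month": 8, "v": 2}],
    [{"year": 2015, "month": 7, "v": 3}],
]
partitions = run_write_job(example_tasks, ["year", "month"], "/warehouse/events")
```

Because the merge is a set union, a partition written by several tasks is reported once. The resulting (values, path) pairs are what a later step could persist to the metastore so that reads no longer need partition discovery.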