Liang-Chi Hsieh created SPARK-6607:
--------------------------------------

             Summary: Aggregation attribute name including special chars '(' and ')' should be replaced before generating Parquet schema
                 Key: SPARK-6607
                 URL: https://issues.apache.org/jira/browse/SPARK-6607
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Liang-Chi Hsieh

'(' and ')' are special characters used in the Parquet schema string for type annotations. An aggregation query produces attribute names such as "MAX(a)". If the resulting DataFrame is saved directly as a Parquet file, reading it back fails when the stored schema string is parsed.

There are several ways to solve this. This PR takes the simplest one: replace the special characters in attribute names before generating the Parquet schema from those attributes. Another possible approach is to change all aggregation expression names from "func(column)" to "func[column]".
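
A minimal sketch of the two approaches mentioned in the description, in plain Python so it runs without Spark. The helper names `sanitize_attribute_name` and `to_bracket_form` are hypothetical, and the choice of '_' as the replacement character is an assumption: the issue does not state which characters the PR substitutes.

```python
import re

# Matches the two characters that the issue identifies as special
# in a Parquet schema string.
_SPECIAL_CHARS = re.compile(r"[()]")


def sanitize_attribute_name(name: str, replacement: str = "_") -> str:
    """Replace '(' and ')' in an attribute name.

    Hypothetical helper, not the code from the PR. The default
    replacement '_' is an assumption made for illustration.
    """
    return _SPECIAL_CHARS.sub(replacement, name)


def to_bracket_form(name: str) -> str:
    """Rewrite "func(column)" as "func[column]", the alternative
    naming scheme mentioned in the issue."""
    return name.replace("(", "[").replace(")", "]")


# Attribute names as an aggregation query might produce them.
names = ["a", "MAX(a)", "COUNT(b)"]

sanitized = [sanitize_attribute_name(n) for n in names]
bracketed = [to_bracket_form(n) for n in names]

print(sanitized)
print(bracketed)
```

With a single replacement character, distinct names can collapse into the same sanitized name, so a real fix would need to consider collisions. The bracket form avoids that for names that do not already contain '[' or ']'.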