Liang-Chi Hsieh created SPARK-6607:
--------------------------------------

             Summary: Aggregation attribute name including special chars '(' 
and ')' should be replaced before generating Parquet schema
                 Key: SPARK-6607
                 URL: https://issues.apache.org/jira/browse/SPARK-6607
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Liang-Chi Hsieh


'(' and ')' are special characters used in Parquet schema strings for type 
annotations. When we run an aggregation query, the resulting attributes get 
names such as "MAX(a)".

If we store the resulting DataFrame directly as a Parquet file, reading it back 
fails when the stored schema string is parsed.

There are several ways to solve this. This PR takes the simplest one: it 
replaces the special characters in attribute names before generating the 
Parquet schema from these attributes.
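
As a rough illustration of the renaming idea (not the actual patch), here is a 
minimal Python sketch that works on plain strings. The helper name 
sanitize_attribute_name and the choice of "_" as the replacement character are 
assumptions made for this example only.

```python
import re


def sanitize_attribute_name(name: str, replacement: str = "_") -> str:
    """Replace '(' and ')' so the name no longer contains the characters
    that Parquet schema strings reserve for type annotations.

    Hypothetical helper: the "_" replacement is an assumption, not
    necessarily the character used by the patch.
    """
    return re.sub(r"[()]", replacement, name)


# Attribute names as an aggregation query might produce them.
names = ["MAX(a)", "COUNT(b)", "c"]
safe_names = [sanitize_attribute_name(n) for n in names]
print(safe_names)  # ['MAX_a_', 'COUNT_b_', 'c']
```

One caveat of any such replacement: a sanitized name can collide with an 
existing column that is already called, say, "MAX_a_".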

Another possible approach would be to change all aggregation expression names 
from "func(column)" to "func[column]".
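
To make the bracket-based naming concrete, here is a small Python sketch of the 
string transformation only. The function to_bracket_name is hypothetical and 
says nothing about where such a change would live in Spark.

```python
def to_bracket_name(name: str) -> str:
    """Rewrite "func(column)" style names as "func[column]".

    Hypothetical helper for illustration; it simply swaps each
    parenthesis for the matching square bracket.
    """
    return name.replace("(", "[").replace(")", "]")


print(to_bracket_name("MAX(a)"))       # MAX[a]
print(to_bracket_name("SUM(ABS(a))"))  # SUM[ABS[a]]
```

Unlike a replacement with a single character, this mapping keeps the nesting of 
the expression readable in the name.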




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
