[jira] [Commented] (SPARK-23388) Support for Parquet Binary DecimalType in VectorizedColumnReader
[ https://issues.apache.org/jira/browse/SPARK-23388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362730#comment-16362730 ]

Sameer Agarwal commented on SPARK-23388:

yes, I agree

> Support for Parquet Binary DecimalType in VectorizedColumnReader
>
> Key: SPARK-23388
> URL: https://issues.apache.org/jira/browse/SPARK-23388
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: James Thompson
> Assignee: James Thompson
> Priority: Major
> Fix For: 2.3.1
>
> The following commit to spark removed support for decimal binary types:
> [https://github.com/apache/spark/commit/9c29c557635caf739fde942f53255273aac0d7b1#diff-7bdf5fd0ce0b1ccbf4ecf083611976e6R428]
> As per the parquet spec, decimal can be used to annotate binary types, so
> support should be re-added:
> [https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23388) Support for Parquet Binary DecimalType in VectorizedColumnReader
[ https://issues.apache.org/jira/browse/SPARK-23388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361990#comment-16361990 ]

Wenchen Fan commented on SPARK-23388:

This is an interoperability problem: although Spark SQL always writes out large-precision decimal types as fixed-length-byte-array, the Parquet spec also allows binary. Because of this bug, Spark 2.3 may not be able to read Parquet files written by other systems.

cc [~sameerag] shall we include it in Spark 2.3.0?
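To illustrate the interoperability point in the comment above: per the Parquet spec, a DECIMAL stored as binary or as a fixed-length byte array holds the unscaled value as a big-endian two's-complement integer, so both layouts decode the same way. The following is only a minimal Python sketch of that decoding, not the Spark reader code; the function name decode_parquet_decimal is hypothetical, and the scale is assumed to come from the column's schema.

```python
from decimal import Decimal


def decode_parquet_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a Parquet DECIMAL payload (hypothetical helper, not a Spark API).

    `raw` is the unscaled value as a big-endian two's-complement integer,
    which is the layout for both binary and fixed-length byte array columns.
    `scale` is the number of digits after the decimal point.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Build the Decimal from (sign, digits, exponent) so the result is exact
    # and does not depend on the current decimal context precision.
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return Decimal((sign, digits, -scale))


# Minimal-length binary encoding of 12345 with scale 2.
print(decode_parquet_decimal(b"\x30\x39", 2))
# The same value sign-extended to a 4-byte fixed-length array.
print(decode_parquet_decimal(b"\x00\x00\x30\x39", 2))
```

A reader that accepts only the fixed-length form would reject the first payload even though both carry the same value, which is the kind of failure described for files written by other systems.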
[jira] [Commented] (SPARK-23388) Support for Parquet Binary DecimalType in VectorizedColumnReader
[ https://issues.apache.org/jira/browse/SPARK-23388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360160#comment-16360160 ]

Apache Spark commented on SPARK-23388:

User 'jamesthomp' has created a pull request for this issue:
https://github.com/apache/spark/pull/20580

> Support for Parquet Binary DecimalType in VectorizedColumnReader
>
> Key: SPARK-23388
> URL: https://issues.apache.org/jira/browse/SPARK-23388
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: James Thompson
> Priority: Major
>
> The following commit to spark removed support for decimal binary types:
> [https://github.com/apache/spark/commit/9c29c557635caf739fde942f53255273aac0d7b1#diff-7bdf5fd0ce0b1ccbf4ecf083611976e6R428]
> As per the parquet spec, decimal can be used to annotate binary types, so
> support should be re-added:
> [https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal]