[jira] [Comment Edited] (SPARK-40439) DECIMAL value with more precision than what is defined in the schema raises exception in SparkSQL but evaluates to NULL for DataFrame
[ https://issues.apache.org/jira/browse/SPARK-40439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607314#comment-17607314 ]

xsys edited comment on SPARK-40439 at 9/20/22 5:23 PM:
---

[~hyukjin.kwon]: Thank you for your response! Setting {{spark.sql.storeAssignmentPolicy}} to LEGACY works. However, I believe it could be non-trivial for users to discover that {{spark.sql.storeAssignmentPolicy}} is the relevant setting. For instance, after inspecting the code, I thought nullOnOverflow was controlled by {{spark.sql.ansi.enabled}}, and I tried to achieve the desired behavior by altering it (but to no avail). Could we mention setting {{spark.sql.storeAssignmentPolicy}} to {{LEGACY}} in the error message?

was (Author: JIRAUSER288838):
[~hyukjin.kwon]: Thank you for your response! Setting {{spark.sql.storeAssignmentPolicy}} to LEGACY works. However, I believe it could get non-trivial for users to discover that {{spark.sql.storeAssignmentPolicy}} would work. For instance, after inspecting the code, I thought that nullOnOverflow is controlled by {{spark.sql.ansi.enabled}}. I tried to achieve the desired behaviour by altering it (but to no avail). Could we add the usage of {{spark.sql.storeAssignmentPolicy}} to {{LEGACY}} to the error message?

> DECIMAL value with more precision than what is defined in the schema raises
> exception in SparkSQL but evaluates to NULL for DataFrame
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-40439
>                 URL: https://issues.apache.org/jira/browse/SPARK-40439
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: xsys
>            Priority: Major
>
> h3. Describe the bug
> We are trying to store a DECIMAL value {{333.22}} with more precision than what is defined in the schema: {{DECIMAL(20,10)}}. This leads to a {{NULL}} value being stored if the table is created using DataFrames via {{spark-shell}}. However, it leads to the following exception if the table is created via {{spark-sql}}:
> {code:java}
> Failed in [insert into decimal_extra_precision select 333.22]
> java.lang.ArithmeticException: Decimal(expanded,333.22,21,10) cannot be represented as Decimal(20, 10)
> {code}
> h3. Steps to reproduce
> On Spark 3.2.1 (commit {{4f25b3f712}}), using {{spark-sql}}, execute the following:
> {code:java}
> create table decimal_extra_precision(c1 DECIMAL(20,10)) STORED AS ORC;
> insert into decimal_extra_precision select 333.22;
> {code}
> h3. Expected behavior
> We expect the two Spark interfaces ({{spark-sql}} & {{spark-shell}}) to behave consistently for the same data type & input combination ({{DECIMAL(20,10)}} and {{333.22}}).
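> As a side note, here is a minimal {{spark-shell}} sketch of the {{LEGACY}} workaround mentioned in the comment above (assuming Spark 3.2.x with Hive support; this sketch is illustrative rather than an excerpt from the report):
> {code:java}
> // Illustrative sketch (assumes Spark 3.2.x with Hive support), not an excerpt
> // from the report: apply the workaround from the comment above in spark-shell.
> spark.conf.set("spark.sql.storeAssignmentPolicy", "LEGACY")
> spark.sql("create table decimal_extra_precision(c1 DECIMAL(20,10)) STORED AS ORC")
> // With the LEGACY policy the overflowing literal is expected to be stored as
> // NULL instead of raising java.lang.ArithmeticException.
> spark.sql("insert into decimal_extra_precision select 333.22")
> spark.sql("select * from decimal_extra_precision").show()
> {code}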
> Here is a simplified example in {{spark-shell}}, where insertion of the aforementioned decimal value evaluates to a {{NULL}}:
> {code:java}
> scala> import org.apache.spark.sql.{Row, SparkSession}
> import org.apache.spark.sql.{Row, SparkSession}
>
> scala> import org.apache.spark.sql.types._
> import org.apache.spark.sql.types._
>
> scala> val rdd = sc.parallelize(Seq(Row(BigDecimal("333.22"))))
> rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = ParallelCollectionRDD[0] at parallelize at <console>:27
>
> scala> val schema = new StructType().add(StructField("c1", DecimalType(20, 10), true))
> schema: org.apache.spark.sql.types.StructType = StructType(StructField(c1,DecimalType(20,10),true))
>
> scala> val df = spark.createDataFrame(rdd, schema)
> df: org.apache.spark.sql.DataFrame = [c1: decimal(20,10)]
>
> scala> df.show()
> +----+
> |  c1|
> +----+
> |null|
> +----+
>
> scala> df.write.mode("overwrite").format("orc").saveAsTable("decimal_extra_precision")
> 22/08/29 10:33:47 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
>
> scala> spark.sql("select * from decimal_extra_precision;")
> res2: org.apache.spark.sql.DataFrame = [c1: decimal(20,10)]
> {code}
> h3. Root Cause
> The exception is being raised from [Decimal|https://github.com/apache/spark/blob/v3.2.1/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala#L358-L373] ({{nullOnOverflow}} is controlled by {{spark.sql.ansi.enabled}} in [SQLConf|https://github.com/apache/spark/blob/v3.2.1/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L2542-L2551]):
> {code:java}
> private[sql] def toPrecision(
>     precision: Int,
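>     // [the original snippet is truncated in this archive; the remainder below is
>     //  a paraphrased sketch based on the linked Spark 3.2.1 source, not a verbatim copy]
>     scale: Int,
>     roundMode: BigDecimal.RoundingMode.Value = ROUND_HALF_UP,
>     nullOnOverflow: Boolean = true): Decimal = {
>   val copy = clone()
>   if (copy.changePrecision(precision, scale, roundMode)) {
>     copy
>   } else if (nullOnOverflow) {
>     // DataFrame path: the overflowing value silently becomes NULL
>     null
>   } else {
>     // SQL INSERT path under ANSI store assignment: overflow raises an exception,
>     // matching the java.lang.ArithmeticException shown above
>     throw new ArithmeticException(
>       s"$toDebugString cannot be represented as Decimal($precision, $scale).")
>   }
> }
> {code}
> With {{spark.sql.storeAssignmentPolicy}} set to {{LEGACY}} (see the comment above), the SQL insert reportedly stores NULL instead of raising the exception, consistent with the null-on-overflow branch above.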