[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622877#comment-16622877 ]

Apache Spark commented on SPARK-22036:
--------------------------------------

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/22494

> BigDecimal multiplication sometimes returns null
> ------------------------------------------------
>
>                 Key: SPARK-22036
>                 URL: https://issues.apache.org/jira/browse/SPARK-22036
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Olivier Blanvillain
>            Assignee: Marco Gaido
>            Priority: Major
>             Fix For: 2.3.0
>
>
> The multiplication of two BigDecimal numbers sometimes returns null. Here is
> a minimal reproduction:
> {code:java}
> object Main extends App {
>   import org.apache.spark.{SparkConf, SparkContext}
>   import org.apache.spark.sql.SparkSession
>   import spark.implicits._
>   val conf = new SparkConf().setMaster("local[*]").setAppName("REPL").set("spark.ui.enabled", "false")
>   val spark = SparkSession.builder().config(conf).appName("REPL").getOrCreate()
>   implicit val sqlContext = spark.sqlContext
>   case class X2(a: BigDecimal, b: BigDecimal)
>   val ds = sqlContext.createDataset(List(X2(BigDecimal(-0.1267333984375), BigDecimal(-1000.1))))
>   val result = ds.select(ds("a") * ds("b")).collect.head
>   println(result) // [null]
> }
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619129#comment-16619129 ]

Marco Gaido commented on SPARK-22036:
-------------------------------------

[~bersprockets] I created SPARK-25454 for tracking since I have a path for this and it might be considered as a blocker for 2.4, so I wanted to expedite it. I am submitting a patch for this soon. Sorry for the problem again. Thanks.
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618892#comment-16618892 ]

Marco Gaido commented on SPARK-22036:
-------------------------------------

[~bersprockets] first of all thank you for reporting this and sorry for my mistake on this. I think the solution you are suggesting isn't the right one. Also, the result in the case allowPrecisionLoss=true should not have any truncation here. The problem is the way we handle negative scale, so I think this issue is related to SPARK-24468. Hive and MSSQL, which we are taking our rules from, do not allow negative scale, while we do. So this has to be revisited. Could you please submit a new JIRA for this? Meanwhile I am starting to work on it and I'll submit a fix ASAP. Sorry for the trouble. Thanks.
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618104#comment-16618104 ]

Bruce Robbins commented on SPARK-22036:
---------------------------------------

[~mgaido] In this change, you modified how precision and scale are determined when literals are promoted to decimal. For example, before the change, an integer literal's precision and scale would be hardcoded to DecimalType(10, 0). After the change, it's based on the number of digits in the literal.

However, that new behavior for literals is not toggled by {{spark.sql.decimalOperations.allowPrecisionLoss}} like the other changes in behavior introduced by the PR.

As a result, there are cases where we see truncation and rounding in 2.3/2.4 that we don't see in 2.2, and this change in behavior is not controllable via the configuration setting. E.g.:

In 2.2:
{noformat}
scala> sql("select 26393499451/(1e6 * 1000) as c1").printSchema
root
 |-- c1: decimal(27,13) (nullable = true) <== 13 decimal digits

scala> sql("select 26393499451/(1e6 * 1000) as c1").show
+------------+
|          c1|
+------------+
|26.393499451|
+------------+
{noformat}

In 2.3 and up:
{noformat}
scala> sql("set spark.sql.decimalOperations.allowPrecisionLoss").show
+--------------------+-----+
|                 key|value|
+--------------------+-----+
|spark.sql.decimal...| true|
+--------------------+-----+

scala> sql("select 26393499451/(1e6 * 1000) as c1").printSchema
root
 |-- c1: decimal(12,7) (nullable = true)

scala> sql("select 26393499451/(1e6 * 1000) as c1").show
+----------+
|        c1|
+----------+
|26.3934995| <== result is truncated and rounded up.
+----------+

scala> sql("set spark.sql.decimalOperations.allowPrecisionLoss=false").show
+--------------------+-----+
|                 key|value|
+--------------------+-----+
|spark.sql.decimal...|false|
+--------------------+-----+

scala> sql("select 26393499451/(1e6 * 1000) as c1").printSchema
root
 |-- c1: decimal(12,7) (nullable = true)

scala> sql("select 26393499451/(1e6 * 1000) as c1").show
+----------+
|        c1|
+----------+
|26.3934995| <== result is still truncated and rounded up.
+----------+

scala>
{noformat}

I can force it to behave the old way, at least for this case, by explicitly casting the literal:
{noformat}
scala> sql("select 26393499451/(1e6 * cast(1000 as decimal(10, 0))) as c1").show
+------------+
|          c1|
+------------+
|26.393499451|
+------------+
{noformat}

Do you think it makes sense for {{spark.sql.decimalOperations.allowPrecisionLoss}} to also toggle how literal promotion happens (the old way vs. the new way)?
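Editor's note: the two schemas reported above, decimal(27,13) and decimal(12,7), can be derived by hand from the Hive-style result-type rules for decimal multiplication and division. The sketch below is plain Scala, not Spark code. {{DecType}}, {{multiplyType}} and {{divideType}} are hypothetical names introduced for illustration, and the literal types passed in are assumptions chosen because they reproduce the reported schemas: {{1e6}} as decimal(1,-6) (a negative scale), the integer literal {{1000}} as decimal(10,0) in 2.2 and decimal(4,0) in 2.3, and {{26393499451}} as decimal(20,0) in 2.2 and decimal(11,0) in 2.3. The cap at precision 38 is left out because none of these types reaches it.

```scala
object DivideTypeSketch {
  // A decimal type described only by its precision and scale.
  final case class DecType(precision: Int, scale: Int)

  // Multiplication rule: precision = p1 + p2 + 1, scale = s1 + s2.
  def multiplyType(l: DecType, r: DecType): DecType =
    DecType(l.precision + r.precision + 1, l.scale + r.scale)

  // Division rule: scale = max(6, s1 + p2 + 1),
  // precision = p1 - s1 + s2 + scale.
  def divideType(l: DecType, r: DecType): DecType = {
    val scale = math.max(6, l.scale + r.precision + 1)
    DecType(l.precision - l.scale + r.scale + scale, scale)
  }

  def main(args: Array[String]): Unit = {
    val oneE6 = DecType(1, -6) // assumed type of the literal 1e6

    // 2.2: integer literal hardcoded to decimal(10,0), long literal to decimal(20,0).
    val divisor22 = multiplyType(oneE6, DecType(10, 0))
    println(divideType(DecType(20, 0), divisor22)) // DecType(27,13)

    // 2.3: literal types sized by their number of digits.
    val divisor23 = multiplyType(oneE6, DecType(4, 0))
    println(divideType(DecType(11, 0), divisor23)) // DecType(12,7)
  }
}
```

Under these assumptions the shorter literal types in 2.3 shrink the divisor's precision from 12 to 6, which in turn shrinks the result scale from 13 to 7. That matches the rounding seen in the 2.3 output regardless of the {{allowPrecisionLoss}} setting.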
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351366#comment-16351366 ]

Apache Spark commented on SPARK-22036:
--------------------------------------

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/20498
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297056#comment-16297056 ]

Apache Spark commented on SPARK-22036:
--------------------------------------

User 'mgaido91' has created a pull request for this issue:
https://github.com/apache/spark/pull/20023
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291805#comment-16291805 ]

Anvesh R commented on SPARK-22036:
----------------------------------

+1

Issue reproduced on spark-2.2.0:

Data at s3 location - s3://bucket/spark-sql-jira/ :

100|9

drop table if exists test;
CREATE EXTERNAL TABLE `test` (
  a decimal(38,10),
  b decimal(38,10)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 's3://bucket/spark-sql-jira/';

spark-sql> select a,(a*b*0.98765432100) from test;
100 9876444.4445679
Time taken: 11.033 seconds, Fetched 1 row(s)

spark-sql> select a,(a*b*0.987654321000) from test;
100 NULL
Time taken: 0.523 seconds, Fetched 1 row(s)

Changing a column's scale from decimal(38,10) to decimal(38,9) also helped, but we would lose precision.
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287025#comment-16287025 ]

Tak Lon (Stephen) Wu commented on SPARK-22036:
----------------------------------------------

+1, we met a similar issue when multiplying 3+ BigDecimal values. Although we have a workaround of casting them into a smaller number, it's not easy to tell when it will fail, especially when we have a more complicated query.
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169129#comment-16169129 ]

Olivier Blanvillain commented on SPARK-22036:
---------------------------------------------

Adding a cast indeed prevents getting null values. However, this solution is less than satisfactory, as the resulting multiplications are *less* precise on {{BigDecimal}} than on {{Double}}. Here is an example.

We compute the product of two numbers: {{0.199735164642333984375}} and {{-0.10430812835693359375}}. Below are the results of this multiplication by the JVM as {{java.lang.Double}} (double), by Spark using {{.cast(DecimalType(20,14))}} (casted), and finally using unlimited precision arithmetic (actual):

double: -2.083400119090584E-10
casted: -2.083400119509193464E-10
actual: -2.083400119090583757497370243072509765625E-10
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169074#comment-16169074 ]

Marco Gaido commented on SPARK-22036:
-------------------------------------

Maybe the "bad" part is that by default Spark creates the columns as {{Decimal(38, 18)}}. This is the problem. With a multiplication this leads to a {{Decimal(38, 36)}}, which as you can easily understand is the root of the problem of your operation. If you cast the two columns before the multiplication, like {{ds("a").cast(DecimalType(20,14))}}, you won't have any problem anymore. Currently you have to tell Spark which are the right values to use.
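Editor's note: a small worked example may help show why the cast changes the outcome. The sketch below is plain Scala rather than Spark code. It assumes that the 2.2 multiplication rule is precision = p1 + p2 + 1 and scale = s1 + s2, with both capped at 38, which is consistent with the {{Decimal(38, 36)}} mentioned in the comment above. {{DecType}}, {{integerDigits}} and {{multiplyType}} are hypothetical names used only for illustration.

```scala
object MultiplyTypeSketch {
  val MaxPrecision = 38

  final case class DecType(precision: Int, scale: Int) {
    // Digits left for the integer part of a value of this type.
    def integerDigits: Int = precision - scale
  }

  // Assumed rule: precision = p1 + p2 + 1, scale = s1 + s2, each capped at 38.
  def multiplyType(l: DecType, r: DecType): DecType =
    DecType(
      math.min(l.precision + r.precision + 1, MaxPrecision),
      math.min(l.scale + r.scale, MaxPrecision))

  def main(args: Array[String]): Unit = {
    val default = DecType(38, 18) // what Spark infers for a BigDecimal field
    val casted  = DecType(20, 14) // after .cast(DecimalType(20,14))

    val uncastResult = multiplyType(default, default)
    val castResult   = multiplyType(casted, casted)

    println(s"$uncastResult leaves ${uncastResult.integerDigits} integer digits")
    println(s"$castResult leaves ${castResult.integerDigits} integer digits")
  }
}
```

With the default types the product keeps only 2 digits for the integer part, so a value such as 126.74607177734375 cannot be represented. After the cast the product type keeps 10 integer digits, at the cost of fewer fractional digits in the operands.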
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169059#comment-16169059 ]

Marco Gaido commented on SPARK-22036:
-------------------------------------

Honestly I don't know, that is why I said that I don't know what should be done.
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169050#comment-16169050 ]

Olivier Blanvillain commented on SPARK-22036:
---------------------------------------------

I understand. Is this working as intended / fulfils a specification or is it just an artifact of the current implementation?
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169023#comment-16169023 ]

Marco Gaido commented on SPARK-22036:
-------------------------------------

Yes, it is only for multiplications. The reason is that for the multiplication it expects the result to have a scale which is the sum of the two scales of the operands. When there is an overflow in the result of the operations, the result is rounded up and the scale is one less than the expected. In this situation, the result is set to null.
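Editor's note: the behaviour described above can be imitated outside Spark with {{java.math.BigDecimal}}. This is only a sketch of the idea, not Spark's actual code path. It assumes that the product is forced to a scale equal to the sum of the operand scales (capped at 38) and that a value needing more than 38 digits in total is replaced by "no result". {{multiplyOrNone}} is a hypothetical helper, and {{None}} stands in for SQL NULL.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalOverflowSketch {
  val MaxPrecision = 38

  // Multiply two values whose column types have the given scales.
  // Returns None when the product does not fit in 38 digits at the
  // expected result scale.
  def multiplyOrNone(a: JBigDecimal, aScale: Int,
                     b: JBigDecimal, bScale: Int): Option[JBigDecimal] = {
    val resultScale = math.min(aScale + bScale, MaxPrecision)
    val product = a.multiply(b).setScale(resultScale, RoundingMode.HALF_UP)
    if (product.precision() > MaxPrecision) None else Some(product)
  }

  def main(args: Array[String]): Unit = {
    // Values from the reproduction in the issue description.
    val a = new JBigDecimal("-0.1267333984375")
    val b = new JBigDecimal("-1000.1")

    // Default decimal(38,18) columns: result scale 36, 3 integer digits
    // plus 36 fractional digits needs 39 digits.
    println(multiplyOrNone(a, 18, b, 18)) // None

    // Columns cast to scale 14: result scale 28, the value fits.
    println(multiplyOrNone(a, 14, b, 14).isDefined) // true
  }
}
```

The exact product, 126.74607177734375, is perfectly representable on its own. It only fails to fit once it is padded out to 36 fractional digits, which is why the null looks surprising from the user's side.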
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168994#comment-16168994 ]

Olivier Blanvillain commented on SPARK-22036:
---------------------------------------------

This seems to be a multiplication only thing, Spark silently losing precision with addition:

{code:java}
scala> val a = BigDecimal("43.65")
a: scala.math.BigDecimal = 43.65

scala> val b = BigDecimal("61.11")
b: scala.math.BigDecimal = 61.11

scala> case class X2(a: BigDecimal, b: BigDecimal)
defined class X2

scala> val ds = sqlContext.createDataset(List(X2(a, b)))
ds: org.apache.spark.sql.Dataset[X2] = [a: decimal(38,18), b: decimal(38,18)]

scala> val res = ds.select(ds("a") + ds("b")).head.get(0).asInstanceOf[java.math.BigDecimal]
res: java.math.BigDecimal = 104.76

scala> res.subtract((a + b).underlying)
res3: java.math.BigDecimal = 0E-18
{code}
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168989#comment-16168989 ]

Olivier Blanvillain commented on SPARK-22036:
---------------------------------------------

It's surprising because in this case the resulting value fits within the range of representable values:

{code:java}
scala> val result = BigDecimal(-0.1267333984375) * BigDecimal(-1000.1)
result: scala.math.BigDecimal = 126.74607177734375

scala> sqlContext.createDataset(List(result)).head == result
res10: Boolean = true
{code}

Also, Spark will silently lose BigDecimal precision in other circumstances:

{code:java}
scala> val tooPrecise = BigDecimal("126.746071777343751")
tooPrecise: scala.math.BigDecimal = 126.746071777343751

scala> val ds = sqlContext.createDataset(List(tooPrecise))
ds: org.apache.spark.sql.Dataset[scala.math.BigDecimal] = [value: decimal(38,18)]

scala> ds.head
res14: scala.math.BigDecimal = 126.74607177734375

scala> ds.select(ds("value") * BigDecimal(1)).head
res15: org.apache.spark.sql.Row = [126.74607177734375]
{code}

> I am not sure of what should be done in this case

Given that Spark's BigDecimal has bounded precision, I would consider following what is done for other numeric representations and return the closest representable value in case of overflow.
[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168969#comment-16168969 ]

Marco Gaido commented on SPARK-22036:
-------------------------------------

This happens because there is an overflow in the operation. I am not sure of what should be done in this case. The current implementation returns null when an operation causes a loss of precision.

> BigDecimal multiplication sometimes returns null
> ------------------------------------------------
>
>                 Key: SPARK-22036
>                 URL: https://issues.apache.org/jira/browse/SPARK-22036
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Olivier Blanvillain
>
> The multiplication of two BigDecimal numbers sometimes returns null. This
> issue we discovered while doing property based testing for the frameless
> project. Here is a minimal reproduction:
> {code:java}
> object Main extends App {
>   import org.apache.spark.{SparkConf, SparkContext}
>   import org.apache.spark.sql.SparkSession
>   import spark.implicits._
>   val conf = new SparkConf().setMaster("local[*]").setAppName("REPL").set("spark.ui.enabled", "false")
>   val spark = SparkSession.builder().config(conf).appName("REPL").getOrCreate()
>   implicit val sqlContext = spark.sqlContext
>   case class X2(a: BigDecimal, b: BigDecimal)
>   val ds = sqlContext.createDataset(List(X2(BigDecimal(-0.1267333984375), BigDecimal(-1000.1))))
>   val result = ds.select(ds("a") * ds("b")).collect.head
>   println(result) // [null]
> }
> {code}