[jira] [Commented] (SPARK-12128) Multiplication on decimals in dataframe returns null

2015-12-04 Thread kevin yu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15041954#comment-15041954
 ] 

kevin yu commented on SPARK-12128:
--

Hello Philip: Thanks for reporting this problem; this looks like a bug to me. I 
can also recreate the problem. Are you planning to fix it? If not, I 
can look into the code. Thanks.

> Multiplication on decimals in dataframe returns null
> 
>
> Key: SPARK-12128
> URL: https://issues.apache.org/jira/browse/SPARK-12128
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.5.0, 1.5.1, 1.5.2
> Environment: Scala 2.11/Spark 1.5.0/1.5.1/1.5.2
> Reporter: Philip Dodds
>
> I hit a weird issue when I tried to multiply two decimals in a select (either 
> in Scala or as SQL), and I'm assuming I must be missing the point.
> The issue is fairly easy to recreate with something like the following:
> {code:java}
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> import sqlContext.implicits._
> import org.apache.spark.sql.types.Decimal
> case class Trade(quantity: Decimal, price: Decimal)
> val data = Seq.fill(100) {
>   val price = Decimal(20 + scala.util.Random.nextInt(10))
>   val quantity = Decimal(20 + scala.util.Random.nextInt(10))
>   Trade(quantity, price)
> }
> val trades = sc.parallelize(data).toDF()
> trades.registerTempTable("trades")
> trades.select(trades("price") * trades("quantity")).show
> sqlContext.sql("select price/quantity,price*quantity,price+quantity,price-quantity from trades").show
> {code}
> The odd part is that if you run it, you will see that addition, division, and 
> subtraction work, but multiplication returns null.
> Tested on 1.5.1/1.5.2 (Scala 2.10 and 2.11)
> i.e. 
> {code}
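> +------------------+
> |(price * quantity)|
> +------------------+
> |              null|
> |              null|
> |              null|
> |              null|
> |              null|
> +------------------+
> +--------------------+----+--------+--------+
> |                 _c0| _c1|     _c2|     _c3|
> +--------------------+----+--------+--------+
> |0.952380952380952381|null|41.00...|-1.00...|
> |1.380952380952380952|null|50.00...|    8.00|
> |1.272727272727272727|null|50.00...|    6.00|
> |                0.83|null|44.00...|-4.00...|
> |                1.00|null|58.00...|   0E-18|
> +--------------------+----+--------+--------+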
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12128) Multiplication on decimals in dataframe returns null

2015-12-04 Thread kevin yu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15042041#comment-15042041
 ] 

kevin yu commented on SPARK-12128:
--

Hello Philip: I see. Yes, it seems this can happen in other DBs as well. Thanks.




[jira] [Commented] (SPARK-12128) Multiplication on decimals in dataframe returns null

2015-12-04 Thread Philip Dodds (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15041986#comment-15041986
 ] 

Philip Dodds commented on SPARK-12128:
--

Hey [~kevinyu98], I actually closed the issue, since it came down to 
the scale doubling up, so we get a null due to an overflow.

I'm not sure if there is a fix, since it is all down to the default precision 
and scale that we set up, which more or less stops you multiplying them together. 
Similar things happen in other DBs, but it just took me a while to work it out.
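
The overflow described above can be reproduced outside Spark with plain `java.math.BigDecimal`. The following is only a minimal sketch: it assumes the product is forced into decimal(38,36), which is the result type the analyzed plan in this thread reports, and `fitInto` is a hypothetical helper (not a Spark function) whose `None` stands in for the null that Spark returns.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object OverflowSketch {
  // Hypothetical helper (not part of Spark): rescale `value` to `scale`
  // fractional digits and keep it only if the total number of digits
  // still fits within `precision`.
  def fitInto(value: JBigDecimal, precision: Int, scale: Int): Option[JBigDecimal] = {
    val rescaled = value.setScale(scale, RoundingMode.HALF_UP)
    if (rescaled.precision() <= precision) Some(rescaled) else None
  }

  def main(args: Array[String]): Unit = {
    val small = new JBigDecimal(2).multiply(new JBigDecimal(2))   // 4
    val large = new JBigDecimal(20).multiply(new JBigDecimal(20)) // 400
    // Assumed result type: decimal(38,36), i.e. only 2 integer digits.
    println(fitInto(small, 38, 36).isDefined) // true: 1 + 36 digits
    println(fitInto(large, 38, 36).isDefined) // false: 3 + 36 digits
  }
}
```

This matches what the thread observes: 2 * 2 succeeds while 20 * 20 comes back as null.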





[jira] [Commented] (SPARK-12128) Multiplication on decimals in dataframe returns null

2015-12-03 Thread Philip Dodds (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037839#comment-15037839
 ] 

Philip Dodds commented on SPARK-12128:
--

I tried to add a test case, but when I do it using a simplified approach:

{code}
test("SPARK-12128: Multiplication of decimals in dataframe returning null") {
  withTempTable("t") {
    Seq((Decimal(2), Decimal(2)), (Decimal(3), Decimal(3))).toDF("a", "b").registerTempTable("t")
    checkAnswer(sql("SELECT a*b FROM t"),
      Seq(Row(Decimal(4.0).toBigDecimal),
        Row(Decimal(9.0).toBigDecimal)))
  }
}
{code}

It then appears to work, though I did need to convert the target result to 
BigDecimal.




[jira] [Commented] (SPARK-12128) Multiplication on decimals in dataframe returns null

2015-12-03 Thread Philip Dodds (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038104#comment-15038104
 ] 

Philip Dodds commented on SPARK-12128:
--

I took a look at the plan

{code}
Results do not match for query:
== Parsed Logical Plan ==
'Project [unresolvedalias(('a * 'b))]
+- 'UnresolvedRelation `t`, None

== Analyzed Logical Plan ==
_c0: decimal(38,36)
Project [CheckOverflow((promote_precision(cast(a#45 as decimal(38,18))) * promote_precision(cast(b#46 as decimal(38,18)))), DecimalType(38,36)) AS _c0#47]
+- Subquery t
   +- Project [_1#43 AS a#45,_2#44 AS b#46]
      +- LocalRelation [_1#43,_2#44], [[20.00,20.00],[20.00,20.00]]

== Optimized Logical Plan ==
LocalRelation [_c0#47], [[null],[null]]

== Physical Plan ==
LocalTableScan [_c0#47], [[null],[null]]
== Results ==
!== Correct Answer - 2 ==   == Spark Answer - 2 ==
![400.0]  [null]
![400.0]  [null]
  
{code}

The scale appears to have doubled, so if the result has more than two 
digits to the left of the decimal point, it ends up as an overflow. I'm not sure 
whether it is the way I'm using the decimal or the code that is wrong.
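
To make the arithmetic behind the plan explicit, here is a small sketch of how the result type could be derived. It assumes the multiplication rule is precision p1 + p2 + 1 and scale s1 + s2, each capped at 38; that rule is an assumption here, although it is consistent with the decimal(38,18) operands and the decimal(38,36) result shown in the plan. `DecType` and `multiplyResultType` are hypothetical names for illustration, not Spark classes or methods.

```scala
object MultiplyTypeSketch {
  // Hypothetical stand-in for a decimal type; this is not Spark's DecimalType.
  final case class DecType(precision: Int, scale: Int) {
    // Digits available to the left of the decimal point.
    def integerDigits: Int = precision - scale
  }

  val MaxPrecision = 38

  // Assumed rule: precision p1 + p2 + 1, scale s1 + s2, each capped at 38.
  def multiplyResultType(a: DecType, b: DecType): DecType =
    DecType(
      math.min(a.precision + b.precision + 1, MaxPrecision),
      math.min(a.scale + b.scale, MaxPrecision)
    )

  def main(args: Array[String]): Unit = {
    val operand = DecType(38, 18) // operand type seen in the analyzed plan
    val result = multiplyResultType(operand, operand)
    println(result)               // DecType(38,36)
    println(result.integerDigits) // 2
  }
}
```

With only two integer digits left, 4 and 9 fit but 400 does not, which would explain why the simplified test passes and the 20 * 20 case returns null.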





[jira] [Commented] (SPARK-12128) Multiplication on decimals in dataframe returns null

2015-12-03 Thread Philip Dodds (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038068#comment-15038068
 ] 

Philip Dodds commented on SPARK-12128:
--

Actually, that was a little too simple. I went back to the use case and tried 
this, and again it fails.

A simple failing test case would be:

{code}

test("SPARK-12128: Multiplication of decimals in dataframe returning null") {
  withTempTable("t") {
    Seq((Decimal(20), Decimal(20)), (Decimal(20), Decimal(20))).toDF("a", "b").registerTempTable("t")
    // Expected product is 400 for both rows, matching the 20 * 20 inputs above.
    checkAnswer(sql("SELECT a*b FROM t"),
      Seq(Row(Decimal(400.0).toBigDecimal),
        Row(Decimal(400.0).toBigDecimal)))
  }
}

{code}




[jira] [Commented] (SPARK-12128) Multiplication on decimals in dataframe returns null

2015-12-03 Thread Philip Dodds (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037933#comment-15037933
 ] 

Philip Dodds commented on SPARK-12128:
--

I believe the problem is with the Decimal constructor. If you use

{code}
val data = Seq.fill(5) {
  Trade(Decimal(BigDecimal(5), 38, 20), Decimal(BigDecimal(5), 38, 20))
}
{code}

then you will get the correct result, so we can probably just close the issue.
