GitHub user kiszk opened a pull request:
https://github.com/apache/spark/pull/17684
[SPARK-20341][SQL] Support BigInt's value that does not fit in long value
range
## What changes were proposed in this pull request?
This PR avoids an exception in the case where `scala.math.BigInt` has a
value that does not fit into long value range (e.g. `Long.MAX_VALUE+1`). When
we run the following code by using the current Spark, the following exception
is thrown.
This PR keeps the value using `BigDecimal` if we detect such an overflow
case by catching `ArithmeticException`.
Sample program:
```
case class BigIntWrapper(value:scala.math.BigInt)```
spark.createDataset(BigIntWrapper(scala.math.BigInt("10000000000000000002"))::Nil).show
```
Exception:
```
Error while encoding: java.lang.ArithmeticException: BigInteger out of long
range
staticinvoke(class org.apache.spark.sql.types.Decimal$, DecimalType(38,0),
apply, assertnotnull(assertnotnull(input[0, org.apache.spark.sql.BigIntWrapper,
true])).value, true) AS value#0
java.lang.RuntimeException: Error while encoding:
java.lang.ArithmeticException: BigInteger out of long range
staticinvoke(class org.apache.spark.sql.types.Decimal$, DecimalType(38,0),
apply, assertnotnull(assertnotnull(input[0, org.apache.spark.sql.BigIntWrapper,
true])).value, true) AS value#0
at
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:290)
at
org.apache.spark.sql.SparkSession$$anonfun$2.apply(SparkSession.scala:454)
at
org.apache.spark.sql.SparkSession$$anonfun$2.apply(SparkSession.scala:454)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at
org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:454)
at org.apache.spark.sql.Agg$$anonfun$18.apply$mcV$sp(MySuite.scala:192)
at org.apache.spark.sql.Agg$$anonfun$18.apply(MySuite.scala:192)
at org.apache.spark.sql.Agg$$anonfun$18.apply(MySuite.scala:192)
at
org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
at
org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
at
org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
at
org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
...
Caused by: java.lang.ArithmeticException: BigInteger out of long range
at java.math.BigInteger.longValueExact(BigInteger.java:4531)
at org.apache.spark.sql.types.Decimal.set(Decimal.scala:140)
at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:434)
at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
at
org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown
Source)
at
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:287)
... 59 more
```
## How was this patch tested?
Add new test suite into `DecimalSuite`
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kiszk/spark SPARK-20341
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/17684.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #17684
----
commit 9a15ba217ce442371db885e166f1e464d8eb9cfc
Author: Kazuaki Ishizaki <[email protected]>
Date: 2017-04-19T12:53:01Z
initial commit
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]