Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/23210#discussion_r238471660
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
@@ -1647,6 +1647,15 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
checkDataset(ds, data: _*)
checkAnswer(ds.select("x"), Seq(Row(1), Row(2)))
}
+
+ test("SPARK-26233: serializer should enforce decimal precision and
scale") {
--- End diff ---
Well, everything is possible, but it is not easy, actually, because the issue here happens in codegen, not when we retrieve the output. So if we just encode and decode, everything is fine. The problem occurs if there is any transformation in codegen in between, because there the underlying `Decimal` is used directly (under the assumption that it has the same precision and scale as its data type, which without the current change is not always true). I tried checking the precision and scale of the serialized object, but that is not really feasible, as they are converted when the value is read back (please see `UnsafeRow`)... So I'd avoid this, actually.
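
To illustrate the mismatch described above, here is a minimal sketch (not the test added in this PR) using Spark's `Decimal` and `DecimalType` from `org.apache.spark.sql.types`: the underlying `Decimal` can carry a precision and scale that differ from the declared data type, and `changePrecision` is what aligns them. The object name `DecimalMismatchSketch` is purely illustrative.

```scala
import org.apache.spark.sql.types.{Decimal, DecimalType}

object DecimalMismatchSketch {
  def main(args: Array[String]): Unit = {
    val declared = DecimalType(10, 4)       // what the schema declares
    val d = Decimal(BigDecimal("12.3"))     // underlying precision 3, scale 1

    println(s"declared: precision=${declared.precision}, scale=${declared.scale}")
    println(s"actual:   precision=${d.precision}, scale=${d.scale}")

    // Enforcing the declared precision/scale (what the serializer should do)
    // aligns the underlying Decimal with its data type; codegen'd
    // transformations assume this alignment already holds.
    val ok = d.changePrecision(declared.precision, declared.scale)
    println(s"after changePrecision (ok=$ok): precision=${d.precision}, scale=${d.scale}")
  }
}
```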
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]