Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/23210#discussion_r238471660
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
@@ -1647,6 +1647,15 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
checkDataset(ds, data: _*)
checkAnswer(ds.select("x"), Seq(Row(1), Row(2)))
}
+
+ test("SPARK-26233: serializer should enforce decimal precision and
scale") {
--- End diff ---
Well, everything is possible, but it is not easy, actually, because the issue here happens in codegen, not when we retrieve the output. So if we just encode and decode, everything is fine. The problem occurs if there is any transformation in codegen in between, because there the underlying `Decimal` is used directly (under the assumption that it has the same precision and scale as its data type, which without the current change is not always true). I tried checking the precision and scale of the serialized object, but that is not really feasible, as they are converted when the value is read back (please see `UnsafeRow`)... So I'd avoid this, actually.
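
To illustrate the mismatch described above, here is a minimal sketch (not the test added in this PR) using Spark's `Decimal` and `DecimalType` from `org.apache.spark.sql.types`: the underlying `Decimal` can carry a precision and scale that differ from the declared data type, and `changePrecision` is what aligns them. The object name `DecimalMismatchSketch` is purely illustrative.

```scala
import org.apache.spark.sql.types.{Decimal, DecimalType}

object DecimalMismatchSketch {
  def main(args: Array[String]): Unit = {
    val declared = DecimalType(10, 4)       // what the schema declares
    val d = Decimal(BigDecimal("12.3"))     // underlying precision 3, scale 1

    println(s"declared: precision=${declared.precision}, scale=${declared.scale}")
    println(s"actual:   precision=${d.precision}, scale=${d.scale}")

    // Enforcing the declared precision/scale (what the serializer should do)
    // aligns the underlying Decimal with its data type; codegen'd
    // transformations assume this alignment already holds.
    val ok = d.changePrecision(declared.precision, declared.scale)
    println(s"after changePrecision (ok=$ok): precision=${d.precision}, scale=${d.scale}")
  }
}
```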
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]