At max precision, we currently error out regardless of scale. It looks like SQL Server not only saturates at max precision, but may also reduce scale; I think we could provide scale reduction as an option if there's demand for it. In the meantime, we can at least promote decimal128 to decimal256 and perform overflow checks.
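To make that concrete, here's a rough sketch (plain Python, not the actual kernel code, and the helper name is made up) of how the result type of an addition could be derived under the Redshift-style rule, promoting to decimal256 past 38 digits and saturating at 76:

    DECIMAL128_MAX_PRECISION = 38
    DECIMAL256_MAX_PRECISION = 76

    def add_result_type(p1, s1, p2, s2):
        # Redshift-style addition rule:
        #   scale = max(s1, s2)
        #   precision = max(p1 - s1, p2 - s2) + scale + 1
        scale = max(s1, s2)
        precision = max(p1 - s1, p2 - s2) + scale + 1
        if precision <= DECIMAL128_MAX_PRECISION:
            return ("decimal128", precision, scale)
        # Promote to decimal256 and saturate at the max precision; values
        # that still don't fit would have to be caught by a runtime
        # overflow check.
        return ("decimal256", min(precision, DECIMAL256_MAX_PRECISION), scale)

    print(add_result_type(38, 0, 38, 0))  # ('decimal256', 39, 0) -> promoted
    print(add_result_type(76, 0, 76, 0))  # ('decimal256', 76, 0) -> saturated

Note that "99...9 (76 digits)" + "1" would still type as decimal256(76, 0) here, so that overflow only surfaces as an error when the kernel actually executes.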
Thanks,
David

On Fri, Oct 8, 2021, at 00:50, Micah Kornfield wrote:
> Do we error out regardless of scale, or does rounding occur with scale > 0?
>
> Arguably these last three cases should be:
> > "1" (decimal128(38, 0)) + "1" (decimal128(38, 0)) = "2" (decimal256(39, 0)) (promote to decimal256)
> > "1" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = "2" (decimal256(76, 0)) (saturate at max precision)
> > "99...9 (76 digits)" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = error (overflow)
>
> This seems more or less what I expect.
>
> On Thu, Sep 30, 2021 at 7:03 AM Keith Kraus <keith.j.kr...@gmail.com> wrote:
> > For another point of reference, here's Microsoft's docs for SQL Server on resulting precision and scale for different operators, including its overflow rules:
> >
> > https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-ver15
> >
> > -Keith
> >
> > On Thu, Sep 30, 2021 at 9:42 AM David Li <lidav...@apache.org> wrote:
> > > Hello all,
> > >
> > > While looking at decimal arithmetic kernels in ARROW-13130, the question of what to do about overflow came up.
> > >
> > > Currently, our rules are based on Redshift [1], except we raise an error if we exceed the maximum precision (Redshift's docs imply it saturates instead). Hence, we can always add/subtract/etc. without checking for overflow, but we can't do things like add two decimal256(76, 0) since there's no more precision available.
> > >
> > > If we were to support this last case, what would people expect the unchecked arithmetic kernels to do on overflow? For integers, we wrap around, but this doesn't really make sense for decimals; we could also return nulls, or just always raise an error (this seems the most reasonable to me). Any thoughts?
> > >
> > > For reference, for an unchecked add, currently we have:
> > > "1" (decimal256(75, 0)) + "1" (decimal256(75, 0)) = "2" (decimal256(76, 0))
> > > "1" (decimal128(38, 0)) + "1" (decimal128(38, 0)) = error (not enough precision)
> > > "1" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = error (not enough precision)
> > > "99...9 (76 digits)" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = error (not enough precision)
> > >
> > > Arguably these last three cases should be:
> > > "1" (decimal128(38, 0)) + "1" (decimal128(38, 0)) = "2" (decimal256(39, 0)) (promote to decimal256)
> > > "1" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = "2" (decimal256(76, 0)) (saturate at max precision)
> > > "99...9 (76 digits)" (decimal256(76, 0)) + "1" (decimal256(76, 0)) = error (overflow)
> > >
> > > On a related note, you could also argue that we shouldn't increase the precision like this, though DBs other than Redshift also do this. Playing with DuckDB a bit, though, it doesn't match Redshift: addition/subtraction increase precision by 1 like Redshift does, but division results in a float, and multiplication only adds the input precisions together, while Redshift adds 1 to the sum of the precisions. (That is, decimal128(3, 0) * decimal128(3, 0) is decimal128(7, 0) in Redshift/Arrow but decimal128(6, 0) in DuckDB.)
> > >
> > > [1]: https://docs.aws.amazon.com/redshift/latest/dg/r_numeric_computations201.html
> > >
> > > -David
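For reference, the multiplication difference described in the quoted message boils down to the following (a sketch of the rules as stated in the thread, not library code):

    def multiply_result_redshift(p1, s1, p2, s2):
        # Redshift/Arrow: add the input precisions plus 1
        return (p1 + p2 + 1, s1 + s2)

    def multiply_result_duckdb(p1, s1, p2, s2):
        # DuckDB: only add the input precisions
        return (p1 + p2, s1 + s2)

    print(multiply_result_redshift(3, 0, 3, 0))  # (7, 0) -> decimal128(7, 0)
    print(multiply_result_duckdb(3, 0, 3, 0))    # (6, 0) -> decimal128(6, 0)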