Re: [PR] Improvements to decimal conversion [arrow-dotnet]

via GitHub Sun, 22 Mar 2026 09:54:43 -0700


Copilot commented on code in PR #292:
URL: https://github.com/apache/arrow-dotnet/pull/292#discussion_r2971763383



##########
src/Apache.Arrow/Arrays/Decimal256Array.cs:
##########
@@ -156,6 +156,20 @@ public Decimal256Array(ArrayData data)
             return DecimalUtility.GetDecimal(ValueBuffer, Offset + index, 
Scale, ByteWidth);
         }
 
+        public bool TryGetValue(int index, out decimal? value)
+        {
+            try
+            {
+                value = GetValue(index);
+                return true;
+            }
+            catch (OverflowException)
+            {
+                value = null;
+                return false;
+            }

Review Comment:
   `TryGetValue` is implemented by calling `GetValue` and catching 
`OverflowException`. This makes the "Try" path potentially allocate stack 
traces and can be disproportionately expensive if overflows are common. Prefer 
a non-throwing overflow check/conversion (e.g., `DecimalUtility.TryGetDecimal`) 
and return `false` without relying on exceptions.
   ```suggestion
               if (IsNull(index))
               {
                   value = null;
                   return true;
               }
   
               if (DecimalUtility.TryGetDecimal(ValueBuffer, Offset + index, 
Scale, ByteWidth, out decimal result))
               {
                   value = result;
                   return true;
               }
   
               value = null;
               return false;
   ```



##########
src/Apache.Arrow/DecimalUtility.cs:
##########
@@ -35,10 +38,117 @@ internal static class DecimalUtility
 
         private static int PowersOfTenLength => s_powersOfTen.Length - 1;
 
+#if NET7_0_OR_GREATER
+        // decimal mantissa is 96 bits unsigned
+        private static readonly UInt128 s_maxDecimalMantissa = new 
UInt128(0x0000_0000_FFFF_FFFF, 0xFFFF_FFFF_FFFF_FFFF);
+
+        private static readonly UInt128[] s_uint128PowersOfTen = 
ComputeUInt128Powers();
+
+        private static UInt128[] ComputeUInt128Powers()
+        {
+            var powers = new UInt128[39]; // 10^0 through 10^38
+            powers[0] = 1;
+            for (int i = 1; i < powers.Length; i++)
+                powers[i] = powers[i - 1] * 10;
+            return powers;
+        }
+#endif
+
         internal static decimal GetDecimal(in ArrowBuffer valueBuffer, int 
index, int scale, int byteWidth)
         {
             int startIndex = index * byteWidth;
             ReadOnlySpan<byte> value = valueBuffer.Span.Slice(startIndex, 
byteWidth);
+
+#if NET7_0_OR_GREATER
+            if (byteWidth == 16)
+            {
+                Int128 int128Value = 
BinaryPrimitives.ReadInt128LittleEndian(value);
+                return GetDecimalViaInt128(int128Value, scale);
+            }
+
+            if (byteWidth == 32)
+            {
+                // Check if the value fits in 128 bits (upper 128 bits are 
sign extension)
+                Int128 lower = BinaryPrimitives.ReadInt128LittleEndian(value);
+                Int128 upper = 
BinaryPrimitives.ReadInt128LittleEndian(value.Slice(16));
+                Int128 signExtension = lower < 0 ? Int128.NegativeOne : 
Int128.Zero;
+                if (upper == signExtension)
+                {
+                    return GetDecimalViaInt128(lower, scale);
+                }
+            }
+#endif
+
+            return GetDecimalViaBigInteger(value, scale);
+        }
+
+#if NET7_0_OR_GREATER
+        private static decimal GetDecimalViaInt128(Int128 integerValue, int 
scale)
+        {
+            bool negative = integerValue < 0;
+            UInt128 abs = negative ? (UInt128)(-integerValue) : 
(UInt128)integerValue;
+
+            // Fast path: value fits directly in decimal (96-bit mantissa, 
scale <= 28)
+            if (abs <= s_maxDecimalMantissa && scale <= 28)
+            {
+                return UInt128ToDecimal(abs, negative, (byte)scale);
+            }
+
+            if (scale == 0)
+            {
+                throw new OverflowException($"Value: {integerValue} is too 
large to be represented as a decimal");
+            }
+
+            // Split into integer and fractional parts
+            if (scale <= 38)
+            {
+                UInt128 scaleBy = s_uint128PowersOfTen[scale];
+                (UInt128 intPart, UInt128 fracPart) = UInt128.DivRem(abs, 
scaleBy);
+
+                if (intPart > s_maxDecimalMantissa)
+                {
+                    throw new OverflowException($"Value is too large to be 
represented as a decimal");

Review Comment:
   The new `OverflowException` thrown when `intPart > s_maxDecimalMantissa` 
omits the offending value (unlike the other overflow branches that include it). 
Including `integerValue` (and possibly `scale`) in the message would make 
diagnosing conversion failures much easier.
   ```suggestion
                       throw new OverflowException($"Value: {integerValue} with 
scale: {scale} is too large to be represented as a decimal");
   ```



##########
src/Apache.Arrow/Arrays/Decimal128Array.cs:
##########
@@ -147,6 +147,20 @@ public Decimal128Array(ArrayData data)
             return DecimalUtility.GetDecimal(ValueBuffer, Offset + index, 
Scale, ByteWidth);
         }
 
+        public bool TryGetValue(int index, out decimal? value)
+        {
+            try
+            {
+                value = GetValue(index);
+                return true;
+            }
+            catch (OverflowException)
+            {
+                value = null;
+                return false;
+            }

Review Comment:
   `TryGetValue` uses an exception (OverflowException) for control flow. For a 
"Try*" API this can be very expensive in hot paths and can distort 
performance-sensitive usage. Consider adding a non-throwing conversion path 
(e.g., `DecimalUtility.TryGetDecimal(...)`) that returns `false` on overflow, 
and have `TryGetValue` delegate to that instead of catching exceptions.
   ```suggestion
               if (IsNull(index))
               {
                   value = null;
                   return true;
               }
   
               if (DecimalUtility.TryGetDecimal(ValueBuffer, Offset + index, 
Scale, ByteWidth, out decimal result))
               {
                   value = result;
                   return true;
               }
   
               value = null;
               return false;
   ```



##########
test/Apache.Arrow.Benchmarks/DecimalArrayBenchmark.cs:
##########
@@ -0,0 +1,131 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+using System;
+using System.Data.SqlTypes;
+using Apache.Arrow.Types;
+using BenchmarkDotNet.Attributes;
+
+namespace Apache.Arrow.Benchmarks
+{
+    [MemoryDiagnoser]
+    public class DecimalArrayBenchmark
+    {
+        [Params(10_000)]
+        public int Count { get; set; }
+
+        private Decimal128Array _decimal128LowScale;
+        private Decimal128Array _decimal128HighScale;
+        private Decimal256Array _decimal256LowScale;
+        private Decimal256Array _decimal256HighScale;
+
+        [GlobalSetup]
+        public void GlobalSetup()
+        {
+            var random = new Random(42);
+
+            _decimal128LowScale = BuildDecimal128Array(new Decimal128Type(14, 
4), random);
+            _decimal128HighScale = BuildDecimal128Array(new Decimal128Type(38, 
20), random);
+            _decimal256LowScale = BuildDecimal256Array(new Decimal256Type(14, 
4), random);
+            _decimal256HighScale = BuildDecimal256Array(new Decimal256Type(76, 
38), random);
+        }
+
+        private Decimal128Array BuildDecimal128Array(Decimal128Type type, 
Random random)
+        {
+            var builder = new Decimal128Array.Builder(type);
+            for (int i = 0; i < Count; i++)
+            {
+                builder.Append((decimal)Math.Round(random.NextDouble() * 
10000, Math.Min(type.Scale, 10)));
+            }
+            return builder.Build();
+        }
+
+        private Decimal256Array BuildDecimal256Array(Decimal256Type type, 
Random random)
+        {
+            var builder = new Decimal256Array.Builder(type);
+            for (int i = 0; i < Count; i++)
+            {
+                builder.Append((decimal)Math.Round(random.NextDouble() * 
10000, Math.Min(type.Scale, 10)));
+            }
+            return builder.Build();
+        }
+
+        [Benchmark]
+        public decimal? Decimal128_GetValue_LowScale()
+        {
+            decimal? sum = 0;
+            for (int i = 0; i < _decimal128LowScale.Length; i++)
+            {
+                sum += _decimal128LowScale.GetValue(i);
+            }
+            return sum;

Review Comment:
   These benchmarks sum `decimal?` values into a `decimal?` accumulator. Since 
the arrays built in `GlobalSetup` contain no nulls, using nullable arithmetic 
adds extra overhead (lifted operators/null checks) and can skew the measurement 
away from the conversion cost. Consider using a non-nullable accumulator (e.g., 
`decimal sum`) and adding `GetValue(i).GetValueOrDefault()` (or otherwise 
guaranteeing non-null) to keep the benchmark focused.



##########
test/Apache.Arrow.Benchmarks/DecimalArrayBenchmark.cs:
##########
@@ -0,0 +1,131 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+using System;
+using System.Data.SqlTypes;
+using Apache.Arrow.Types;
+using BenchmarkDotNet.Attributes;
+
+namespace Apache.Arrow.Benchmarks
+{
+    [MemoryDiagnoser]
+    public class DecimalArrayBenchmark
+    {
+        [Params(10_000)]
+        public int Count { get; set; }
+
+        private Decimal128Array _decimal128LowScale;
+        private Decimal128Array _decimal128HighScale;
+        private Decimal256Array _decimal256LowScale;
+        private Decimal256Array _decimal256HighScale;
+
+        [GlobalSetup]
+        public void GlobalSetup()
+        {
+            var random = new Random(42);
+
+            _decimal128LowScale = BuildDecimal128Array(new Decimal128Type(14, 
4), random);
+            _decimal128HighScale = BuildDecimal128Array(new Decimal128Type(38, 
20), random);
+            _decimal256LowScale = BuildDecimal256Array(new Decimal256Type(14, 
4), random);
+            _decimal256HighScale = BuildDecimal256Array(new Decimal256Type(76, 
38), random);
+        }
+
+        private Decimal128Array BuildDecimal128Array(Decimal128Type type, 
Random random)
+        {
+            var builder = new Decimal128Array.Builder(type);
+            for (int i = 0; i < Count; i++)
+            {
+                builder.Append((decimal)Math.Round(random.NextDouble() * 
10000, Math.Min(type.Scale, 10)));
+            }
+            return builder.Build();
+        }
+
+        private Decimal256Array BuildDecimal256Array(Decimal256Type type, 
Random random)
+        {
+            var builder = new Decimal256Array.Builder(type);
+            for (int i = 0; i < Count; i++)
+            {
+                builder.Append((decimal)Math.Round(random.NextDouble() * 
10000, Math.Min(type.Scale, 10)));
+            }
+            return builder.Build();
+        }
+
+        [Benchmark]
+        public decimal? Decimal128_GetValue_LowScale()
+        {
+            decimal? sum = 0;
+            for (int i = 0; i < _decimal128LowScale.Length; i++)
+            {
+                sum += _decimal128LowScale.GetValue(i);
+            }
+            return sum;
+        }
+
+        [Benchmark]
+        public decimal? Decimal128_GetValue_HighScale()
+        {
+            decimal? sum = 0;
+            for (int i = 0; i < _decimal128HighScale.Length; i++)
+            {
+                sum += _decimal128HighScale.GetValue(i);
+            }
+            return sum;
+        }
+
+        [Benchmark]
+        public decimal? Decimal256_GetValue_LowScale()
+        {
+            decimal? sum = 0;
+            for (int i = 0; i < _decimal256LowScale.Length; i++)
+            {
+                sum += _decimal256LowScale.GetValue(i);
+            }
+            return sum;
+        }
+
+        [Benchmark]
+        public decimal? Decimal256_GetValue_HighScale()
+        {
+            decimal? sum = 0;
+            for (int i = 0; i < _decimal256HighScale.Length; i++)
+            {
+                sum += _decimal256HighScale.GetValue(i);
+            }
+            return sum;
+        }
+
+        [Benchmark]
+        public SqlDecimal? Decimal128_GetSqlDecimal()
+        {
+            SqlDecimal? sum = 0;
+            for (int i = 0; i < _decimal128LowScale.Length; i++)
+            {
+                sum += _decimal128LowScale.GetSqlDecimal(i);
+            }
+            return sum;
+        }
+
+        [Benchmark]
+        public string Decimal256_GetString_HighScale()
+        {
+            string last = null;
+            for (int i = 0; i < _decimal256HighScale.Length; i++)
+            {
+                last = _decimal256HighScale.GetString(i);
+            }
+            return last;

Review Comment:
   `Decimal256_GetString_HighScale` returns a non-nullable `string`, but `last` 
is initialized to `null` and the method would return `null` if the array length 
were ever 0 (or if `GetString` returned null). Consider changing the return 
type to `string?` or initializing from the first element after asserting 
`Length > 0` to keep nullability/behavior consistent.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Improvements to decimal conversion [arrow-dotnet]

Reply via email to