zeroshade commented on code in PR #1355:
URL: https://github.com/apache/iceberg-go/pull/1355#discussion_r3521437645


##########
manifest.go:
##########
@@ -1848,18 +1864,28 @@ func convertTimestampMicrosValue(v any) any {
        return v
 }
 
-func convertDecimalValue(v any) any {
-       if dec, ok := v.(Decimal); ok {
-               fixedSize := internal.DecimalRequiredBytes(len(dec.String()))
-               bytes, err := DecimalLiteral(dec).MarshalBinary()
-               if err != nil {
-                       return v
-               }
+func convertDecimalValue(v any, fixedSize int) (any, error) {
+       switch dec := v.(type) {
+       case Decimal:
+               return encodeDecimalBytes(DecimalLiteral(dec), fixedSize)
+       case DecimalLiteral:
+               return encodeDecimalBytes(dec, fixedSize)
+       default:
+               return v, nil
+       }
+}
 
-               return padOrTruncateBytes(bytes, fixedSize)
+func encodeDecimalBytes(dec DecimalLiteral, fixedSize int) ([]byte, error) {
+       if fixedSize <= 0 {
+               return nil, fmt.Errorf("missing decimal fixed size")
        }
 
-       return v
+       bytes, err := dec.MarshalBinary()
+       if err != nil {
+               return nil, err
+       }
+
+       return padOrTruncateBytes(bytes, fixedSize), nil

Review Comment:
   [BLOCKER] This call still routes through `padOrTruncateBytes`, whose 
implementation at lines 1899-1905 left-pads short decimal byte slices with 
`0x00` unconditionally. That is only correct for non-negative values; a 
negative unscaled value such as -1 for precision 10 (5 bytes) becomes `00 00 00 
00 ff` and decodes as +255. Fix with sign-aware fixed-width encoding: when 
`len(bytes) < fixedSize`, prepend `0xff` if `bytes[0]&0x80 != 0`, otherwise 
`0x00` (two's-complement sign extension), and add 
negative/zero/max-precision-38 vectors.



##########
manifest_test.go:
##########
@@ -2044,6 +2046,25 @@ func 
TestReadManifestDecodesNilLogicalPartitionValueFromNullableUnion(t *testing
        }
 }
 
+func TestAvroEncodePartitionDataUsesDeclaredDecimalFixedSize(t *testing.T) {

Review Comment:
   [MINOR] This test only covers one positive decimal, so it misses the 
negative sign-extension bug and oversized-value truncation. Please make this 
table-driven with positive, zero, negative, max-width, and too-large vectors, 
and include a `ManifestWriter`/`ReadManifest` round-trip.



##########
manifest.go:
##########
@@ -1848,18 +1864,28 @@ func convertTimestampMicrosValue(v any) any {
        return v
 }
 
-func convertDecimalValue(v any) any {
-       if dec, ok := v.(Decimal); ok {
-               fixedSize := internal.DecimalRequiredBytes(len(dec.String()))
-               bytes, err := DecimalLiteral(dec).MarshalBinary()
-               if err != nil {
-                       return v
-               }
+func convertDecimalValue(v any, fixedSize int) (any, error) {
+       switch dec := v.(type) {
+       case Decimal:
+               return encodeDecimalBytes(DecimalLiteral(dec), fixedSize)
+       case DecimalLiteral:
+               return encodeDecimalBytes(dec, fixedSize)
+       default:
+               return v, nil
+       }
+}
 
-               return padOrTruncateBytes(bytes, fixedSize)
+func encodeDecimalBytes(dec DecimalLiteral, fixedSize int) ([]byte, error) {
+       if fixedSize <= 0 {
+               return nil, fmt.Errorf("missing decimal fixed size")
        }
 
-       return v
+       bytes, err := dec.MarshalBinary()
+       if err != nil {
+               return nil, err
+       }
+
+       return padOrTruncateBytes(bytes, fixedSize), nil

Review Comment:
   [MAJOR] `padOrTruncateBytes` also silently truncates oversized decimals to 
the low-order bytes (see real lines 1899-1901), which can change the value 
instead of reporting that it does not fit the declared fixed size. Since this 
path already returns an error, make `len(bytes) > fixedSize` verify the dropped 
prefix is redundant sign extension and otherwise return an overflow/precision 
error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to