mzzz-zzm commented on issue #1027:
URL: https://github.com/apache/iceberg-go/issues/1027#issuecomment-4403317385

   While digging into this, I noticed the precision/size confusion isn't 
isolated to `convertDecimalValue` — there's a sibling issue in `getFieldIDMap` 
that I think is worth resolving in the same patch, since a clean fix for #1027 
needs to thread the right metadata through anyway.
   
   In `manifest.go`, `getFieldIDMap` builds a `fixedSizes` map but actually 
populates it with the field's scale:
   
   ```go
   fixedSizes := make(map[int]int)
   // ...
   if typ.LogicalType == atype.Decimal {
       fixedSizes[fid] = typ.Scale   // name says "size", value is scale
   }
   ```
   
   This map is then plumbed through `hasFieldToIDMap.setFieldIDToFixedSizeMap` 
into `dataFile.fieldIDToFixedSize`, and on the read path 
`convertAvroValueToIcebergType` consumes it as scale:
   
   ```go
   case atype.Decimal:
       if r, ok := v.(*big.Rat); ok {
           scale := d.fieldIDToFixedSize[fieldID]   // read back as scale
           // ...
       }
   ```
   
   So the read path happens to work because both ends agree to (mis)use the 
field as scale, but:
   1. The name `fieldIDToFixedSize` / `setFieldIDToFixedSizeMap` is actively 
misleading — it's really `fieldIDToScale`.
   2. It also means we currently have **no** plumbed-through value for the 
actual fixed byte size (or for precision) on the write path, which is exactly 
what `convertDecimalValue` needs to fix #1027.
   
   ### Suggestion
   Rather than just adding a `precision int` parameter to 
`convertDecimalValue`, it would be cleaner to:
   1. Rename `fieldIDToFixedSize` → `fieldIDToScale` (and the setter 
accordingly), since that's what it actually holds. This is a localized rename — 
the field, its setter on `hasFieldToIDMap`, the local in `getFieldIDMap`, and 
the read in `convertAvroValueToIcebergType`.
   2. Add a separate `fieldIDToPrecision map[int]int]` populated from 
`typ.Precision` in `getFieldIDMap`, plumbed through `hasFieldToIDMap` the same 
way.
   3. Have `avroPartitionData` / `convertLogicalTypeValue` pass `precision` 
into `convertDecimalValue`, which then calls 
`internal.DecimalRequiredBytes(precision)` — exactly the fix proposed in the 
original report, just sourced from real schema metadata rather than 
`len(dec.String())`.
   
   Happy to throw up a PR if you like this shape. Just wanted to flag the 
rename first — it's not strictly needed to fix the encoding error, but if we 
add precision plumbing right next to a field called `fixedSize` that actually 
holds scale, things are gonna get confusing fast.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to