mzzz-zzm commented on issue #1027:
URL: https://github.com/apache/iceberg-go/issues/1027#issuecomment-4403317385
While digging into this, I noticed the precision/size confusion isn't
isolated to `convertDecimalValue` — there's a sibling issue in `getFieldIDMap`
that I think is worth resolving in the same patch, since a clean fix for #1027
needs to thread the right metadata through anyway.
In `manifest.go`, `getFieldIDMap` builds a `fixedSizes` map but actually
populates it with the field's scale:
```go
fixedSizes := make(map[int]int)
// ...
if typ.LogicalType == atype.Decimal {
fixedSizes[fid] = typ.Scale // name says "size", value is scale
}
```
This map is then plumbed through `hasFieldToIDMap.setFieldIDToFixedSizeMap`
into `dataFile.fieldIDToFixedSize`, and on the read path
`convertAvroValueToIcebergType` consumes it as scale:
```go
case atype.Decimal:
if r, ok := v.(*big.Rat); ok {
scale := d.fieldIDToFixedSize[fieldID] // read back as scale
// ...
}
```
So the read path happens to work because both ends agree to (mis)use the
field as scale, but:
1. The name `fieldIDToFixedSize` / `setFieldIDToFixedSizeMap` is actively
misleading — it's really `fieldIDToScale`.
2. It also means we currently have **no** plumbed-through value for the
actual fixed byte size (or for precision) on the write path, which is exactly
what `convertDecimalValue` needs to fix #1027.
### Suggestion
Rather than just adding a `precision int` parameter to
`convertDecimalValue`, it would be cleaner to:
1. Rename `fieldIDToFixedSize` → `fieldIDToScale` (and the setter
accordingly), since that's what it actually holds. This is a localized rename —
the field, its setter on `hasFieldToIDMap`, the local in `getFieldIDMap`, and
the read in `convertAvroValueToIcebergType`.
2. Add a separate `fieldIDToPrecision map[int]int]` populated from
`typ.Precision` in `getFieldIDMap`, plumbed through `hasFieldToIDMap` the same
way.
3. Have `avroPartitionData` / `convertLogicalTypeValue` pass `precision`
into `convertDecimalValue`, which then calls
`internal.DecimalRequiredBytes(precision)` — exactly the fix proposed in the
original report, just sourced from real schema metadata rather than
`len(dec.String())`.
Happy to throw up a PR if you like this shape. Just wanted to flag the
rename first — it's not strictly needed to fix the encoding error, but if we
add precision plumbing right next to a field called `fixedSize` that actually
holds scale, things are gonna get confusing fast.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]