laskoviymishka opened a new issue, #2486: URL: https://github.com/apache/iceberg-rust/issues/2486
Follow-up from apache/iceberg-go#1110. iceberg-rust uses `Bucket(u32)` / `Truncate(u32)`, accepting `N` up to `MaxUint32` (4,294,967,295). Java's reference impl uses `int`, capping at `Integer.MAX_VALUE` (2,147,483,647). iceberg-go just landed apache/iceberg-go#1110 which validates `n <= math.MaxInt32` at parse time, aligning with Java. iceberg-cpp uses `int32_t`, so the upper bound matches Java via the type. PyIceberg uses `PositiveInt` with no explicit ceiling. So among the five clients: | client | type | upper bound on `N` | | --- | --- | --- | | Java | `int` | `MaxInt32` (2,147,483,647) | | iceberg-go (post-#1110) | `int` + parse check | `MaxInt32` | | iceberg-cpp | `int32_t` | `INT32_MAX` | | iceberg-rust | `u32` | `MaxUint32` (4,294,967,295) | | PyIceberg | `PositiveInt` | unbounded | The spec doesn't bound `N` explicitly — `Integer.MAX_VALUE` only appears as a mask operand in the bucket formula `(murmur3_x86_32_hash(x) & Integer.MAX_VALUE) % N`, not as a ceiling. So `N > Integer.MAX_VALUE` is undefined per-spec rather than disallowed, but Java is the de facto reference and three of the four other clients match. Worth noting: iceberg-cpp's `TransformFromString` parses via `ParseNumber<int32_t>` (rejects overflow at parse time via the type), but the `Bucket` / `Truncate` constructors don't validate `> 0`, so `bucket[0]` parses cleanly and only fails later at runtime — same shape as pre-#1110 iceberg-go. Probably a separate small issue worth filing on cpp's side. Doesn't bite today — no real table is plausibly partitioned with 2–4 billion buckets. But rust is currently the only client that *could* author a spec in `(MaxInt32, MaxUint32]`, and such a spec would throw `NumberFormatException` in Java and fail to parse in iceberg-go post-#1110. If `u32` is intentional, a doc comment near `Bucket` / `Truncate` capturing the divergence would help future readers not read it as a parity gap. If it's incidental, narrowing to `i32` (or a `NonZero` newtype) would converge with the reference impl. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
