shyjsarah opened a new pull request, #339:
URL: https://github.com/apache/paimon-rust/pull/339

   ### Purpose
   
     When `paimon-rust` is used as the REST client, `CreateTableRequest` 
payloads carrying a `STRING` column are rejected by the REST server with:
   
     Could not parse type at position N: ...
      Input type string: VARCHAR(4294967295)
   
     Root cause: `VarCharType::MAX_LENGTH` (and `VarBinaryType::MAX_LENGTH`) 
were defined as `isize::MAX as u32`. On 64-bit targets `isize::MAX = 2^63 - 1`, 
which truncates to `u32::MAX = 4294967295` when cast to `u32`. 
`VarCharType::string_type()` therefore produced a `VarCharType` whose `Display` 
emits `VARCHAR(4294967295)`. The server-side `DataTypeJsonParser` calls 
`Integer.parseInt` on the length token and throws `NumberFormatException`, 
since the value exceeds Java's `Integer.MAX_VALUE` (2147483647).
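
     The truncation and the resulting parse failure can be reproduced in 
isolation. The snippet below is a minimal sketch, not code from the crate: 
`OLD_MAX_LENGTH`, `varchar_sql`, and `parse_length_as_java_int` are hypothetical 
names, and Rust's `i32` parser stands in for Java's `Integer.parseInt` on the 
assumption that both accept exactly the same numeric range.

```rust
/// The old definition: `isize::MAX` truncated to its low 32 bits.
/// On 64-bit targets this is 0xFFFF_FFFF (4294967295);
/// on 32-bit targets it is 0x7FFF_FFFF (2147483647).
pub const OLD_MAX_LENGTH: u32 = isize::MAX as u32;

/// Hypothetical helper: render a length in the numeric `VARCHAR(n)` form.
pub fn varchar_sql(length: u32) -> String {
    format!("VARCHAR({})", length)
}

/// Hypothetical stand-in for the server-side parse of the length token.
/// `i32` covers the same range as a Java `int`, so a value above
/// 2147483647 is rejected here just as `Integer.parseInt` would reject it.
pub fn parse_length_as_java_int(sql: &str) -> Result<i32, std::num::ParseIntError> {
    let token = sql
        .trim_start_matches("VARCHAR(")
        .trim_end_matches(')');
    token.parse::<i32>()
}
```

     Feeding `varchar_sql(u32::MAX)` into `parse_length_as_java_int` returns an 
error, while `varchar_sql(i32::MAX as u32)` parses cleanly.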
   
     Java↔Java traffic does not surface this because Java's 
`VarCharType.asSQLString()` short-circuits `length == MAX_LENGTH` to the bare 
`STRING` alias. Only the Rust client wrote the numeric form, exposing a length 
that overflows Java's `int` range.
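
     For illustration only, the Java-side short-circuit described above would 
look roughly like this if written in Rust. `SketchVarChar` is a hypothetical 
type, not the crate's `VarCharType`, nullability is ignored, and this PR does 
not change the Rust `Display` output in this way.

```rust
use std::fmt;

/// Assumed protocol cap, equal to Java's `Integer.MAX_VALUE`.
pub const MAX_LENGTH: u32 = i32::MAX as u32;

/// Hypothetical stand-in type used only for this sketch.
pub struct SketchVarChar {
    pub length: u32,
}

impl fmt::Display for SketchVarChar {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.length == MAX_LENGTH {
            // Maximum length collapses to the alias, so no number is written.
            write!(f, "STRING")
        } else {
            write!(f, "VARCHAR({})", self.length)
        }
    }
}
```

     With an alias like this, the numeric maximum never reaches the wire, which 
is why a Java writer would not trip the parser even if the constant were 
wrong.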
   
     The wire-format length cap is a protocol constant, not a function of host 
pointer width. The fix pins both `MAX_LENGTH` constants to `i32::MAX as u32 = 
2147483647`, exactly matching Java `Integer.MAX_VALUE` on all targets (32-bit 
was already correct by coincidence; 64-bit was broken).
   
     ### Brief change log
   
     - `crates/paimon/src/spec/types.rs`
       - `VarCharType::MAX_LENGTH`: `isize::MAX as u32` → `i32::MAX as u32`, 
with a comment recording why this must equal Java `Integer.MAX_VALUE`.
       - `VarBinaryType::MAX_LENGTH`: same change, same reason.
     - Added regression test `test_max_length_fits_java_integer` asserting:
       - Both constants equal `i32::MAX as u32`.
       - `VarCharType::string_type().to_string()` and `VarBinaryType` at 
`MAX_LENGTH` produce length tokens that parse as an `i32`, which has the same 
range as a Java `int`.
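
     The shape of such a regression check can be sketched without the crate. 
Everything below is an assumption made for illustration: the two constants 
mirror the fixed values, `length_token` is a hypothetical helper, and the 
rendered strings are built by hand rather than through the real `Display` 
implementations.

```rust
// Stand-ins for `VarCharType::MAX_LENGTH` and `VarBinaryType::MAX_LENGTH`.
pub const VARCHAR_MAX_LENGTH: u32 = i32::MAX as u32;
pub const VARBINARY_MAX_LENGTH: u32 = i32::MAX as u32;

/// Hypothetical helper: return the text between the first '(' and the
/// last ')', or `None` if the string has no parenthesised length.
pub fn length_token(sql: &str) -> Option<&str> {
    let open = sql.find('(')?;
    let close = sql.rfind(')')?;
    sql.get(open + 1..close)
}

/// Mirrors the two assertions listed above; in a real test module this
/// function would carry the `#[test]` attribute.
pub fn check_max_length_fits_java_integer() {
    assert_eq!(VARCHAR_MAX_LENGTH, i32::MAX as u32);
    assert_eq!(VARBINARY_MAX_LENGTH, i32::MAX as u32);

    let rendered = [
        format!("VARCHAR({})", VARCHAR_MAX_LENGTH),
        format!("VARBINARY({})", VARBINARY_MAX_LENGTH),
    ];
    for sql in rendered {
        let token = length_token(&sql).expect("rendered type has a length");
        // `i32` has the same range as a Java `int`.
        assert!(token.parse::<i32>().is_ok(), "length overflows int: {}", sql);
    }
}
```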
   
     ### Tests
   
     - New unit test: 
`paimon::spec::types::tests::test_max_length_fits_java_integer` covers the 
wire-format invariant.
     - Existing fixture-driven tests (`test_data_type_serialize`, 
`test_data_type_deserialize`) continue to pass — none of the fixtures used the 
old overflow value, so no fixture updates were required.
     - Manual end-to-end repro (Rust client → REST `CreateTable` → server) on 
the reporter's side is the verification path; reviewers can synthesize the same 
payload by serializing a schema containing `VarCharType::string_type()` and 
round-tripping through `DataTypeJsonParser`.
   
     ### API and Format
   
     - No public-API signature change. `VarCharType::MAX_LENGTH` and 
`VarBinaryType::MAX_LENGTH` are `pub const` values that are now smaller 
(`2147483647` instead of the truncated `4294967295`). Callers that constructed 
types via these constants will now produce shorter, *interoperable* 
`VARCHAR(...)` / `VARBINARY(...)` lengths.
     - Storage / wire format: a Rust-only writer that previously persisted a 
schema with the buggy `4294967295` length would have produced data unreadable 
by any Java reader, so the prior behavior was effectively unusable 
cross-language. New writes match the Java canonical value. No migration is 
required for tables actually created by paimon-java.
   
     ### Documentation
   
     None. Behavior change is invisible to users at the SQL/API surface; the 
fix only restores the documented "STRING == `VarCharType.STRING_TYPE`" 
semantics.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
