ariesdevil opened a new pull request, #3092:
URL: https://github.com/apache/fory/pull/3092

   …e schema evolution happy
   
   ### Why?
   This PR fixes a schema evolution issue with tuple structs. 
   
   Previously, tuple struct fields were sorted by type (same as named structs), 
which caused schema evolution to break when adding fields of different types.
   
   For example, evolving `struct Point(f64, u8)` to `struct Point(f64, u8, 
f64)` would cause fields to be incorrectly matched during deserialization 
because the new `f64` field would be sorted before `u8`.
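   
   The mismatch can be reproduced with a minimal sketch. The `width` ranking below is a hypothetical stand-in (wider primitives first), not necessarily the ordering Fory actually uses; it only needs to place `f64` before `u8`, as described above.

```rust
// Hypothetical type ranking used only for illustration: wider primitives
// sort first. This is an assumption, not Fory's real ordering.
fn width(ty: &str) -> usize {
    match ty {
        "f64" | "i64" => 8,
        "u8" => 1,
        _ => 0,
    }
}

// Returns the field types in "sorted by type" order (stable sort).
fn sort_by_type(fields: &[&'static str]) -> Vec<&'static str> {
    let mut sorted = fields.to_vec();
    sorted.sort_by_key(|ty| std::cmp::Reverse(width(ty)));
    sorted
}

fn main() {
    // struct Point(f64, u8)
    let old = sort_by_type(&["f64", "u8"]);
    // struct Point(f64, u8, f64)
    let new = sort_by_type(&["f64", "u8", "f64"]);

    assert_eq!(old, ["f64", "u8"]);
    assert_eq!(new, ["f64", "f64", "u8"]);

    // The second slot held the `u8` before and holds an `f64` now, so any
    // matching that relies on sorted position pairs the wrong fields.
    assert_ne!(old[1], new[1]);
}
```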
   
   ### What does this PR do?
   
   1. Introduce `SortedField` struct: A helper struct that preserves the original 
field index alongside the field reference. This allows us to correctly track 
field positions regardless of serialization order.
   
   2. Preserve tuple struct field order: For tuple structs, fields are no 
longer sorted by type. Instead, they maintain their original definition order 
("0", "1", "2", ...). This ensures that field names consistently map to their 
positions, enabling proper schema evolution.
   
   3. Unify protocol for tuple and named structs: Both tuple structs and named 
structs now use the same underlying protocol (field name based matching), but 
with different field name strategies:
       * Named structs: use field identifiers as names (sorted by type for 
optimal layout)
       * Tuple structs: use positional indices as names (unsorted to preserve 
schema evolution)
   
   4. Add schema evolution tests: Comprehensive tests for tuple struct schema 
evolution, including:
       * Adding fields at the end
       * Removing fields from the end
       * Adding fields with different types (`i64`, `u8`, `f64`)
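   
   A rough sketch of points 1 to 3, under stated assumptions: only the name `SortedField` comes from this PR. `FieldInfo`, `type_rank`, `serialization_order`, and `names` are hypothetical helpers for illustration (the real derive macro presumably operates on parsed `syn` fields), and the type ranking is made up.

```rust
/// Hypothetical stand-in for a parsed struct field.
#[derive(Debug)]
struct FieldInfo {
    ident: Option<&'static str>, // `None` for tuple-struct fields
    ty: &'static str,
}

/// Pairs a field with its declaration index. The layout is assumed.
#[derive(Debug)]
struct SortedField<'a> {
    original_index: usize,
    field: &'a FieldInfo,
}

impl SortedField<'_> {
    /// Name used for matching: the identifier for named fields,
    /// the position ("0", "1", ...) for tuple-struct fields.
    fn name(&self) -> String {
        match self.field.ident {
            Some(ident) => ident.to_string(),
            None => self.original_index.to_string(),
        }
    }
}

/// Hypothetical type ranking (lower rank is serialized first).
fn type_rank(ty: &str) -> u8 {
    match ty {
        "f64" | "i64" => 0,
        "u8" => 1,
        _ => 2,
    }
}

/// Named structs are sorted by type; tuple structs keep definition order.
fn serialization_order(fields: &[FieldInfo]) -> Vec<SortedField<'_>> {
    let mut out: Vec<SortedField<'_>> = fields
        .iter()
        .enumerate()
        .map(|(original_index, field)| SortedField { original_index, field })
        .collect();
    let is_tuple = fields.iter().all(|f| f.ident.is_none());
    if !is_tuple {
        out.sort_by_key(|sf| type_rank(sf.field.ty));
    }
    out
}

/// Field names in serialization order.
fn names(fields: &[FieldInfo]) -> Vec<String> {
    serialization_order(fields).iter().map(|sf| sf.name()).collect()
}

fn main() {
    // struct Point(f64, u8, f64): order and names follow the definition.
    let tuple = [
        FieldInfo { ident: None, ty: "f64" },
        FieldInfo { ident: None, ty: "u8" },
        FieldInfo { ident: None, ty: "f64" },
    ];
    assert_eq!(names(&tuple), ["0", "1", "2"]);

    // struct S { flag: u8, x: f64 }: sorted by type, names are identifiers.
    let named = [
        FieldInfo { ident: Some("flag"), ty: "u8" },
        FieldInfo { ident: Some("x"), ty: "f64" },
    ];
    assert_eq!(names(&named), ["x", "flag"]);
}
```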
   
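   To show why stable positional names make adding or removing trailing fields safe, here is a self-contained model of name-based matching. It does not use the Fory API: `Value`, `lookup`, and the `write_*`/`read_*` functions are hypothetical, and filling a missing field with a default value is an assumption of this sketch.

```rust
/// Hypothetical wire value; a real format would encode bytes instead.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    F64(f64),
    U8(u8),
}

#[derive(Debug, PartialEq)]
struct PointV1(f64, u8); // struct Point(f64, u8)

#[derive(Debug, PartialEq)]
struct PointV2(f64, u8, f64); // struct Point(f64, u8, f64)

// Writers emit (name, value) pairs, using positions as names.
fn write_v1(p: &PointV1) -> Vec<(String, Value)> {
    vec![
        ("0".to_string(), Value::F64(p.0)),
        ("1".to_string(), Value::U8(p.1)),
    ]
}

fn write_v2(p: &PointV2) -> Vec<(String, Value)> {
    vec![
        ("0".to_string(), Value::F64(p.0)),
        ("1".to_string(), Value::U8(p.1)),
        ("2".to_string(), Value::F64(p.2)),
    ]
}

// Finds a field by name, regardless of where it sits in the payload.
fn lookup<'a>(fields: &'a [(String, Value)], name: &str) -> Option<&'a Value> {
    fields.iter().find(|(n, _)| n.as_str() == name).map(|(_, v)| v)
}

// Assumption: a missing or mistyped field falls back to a default value.
fn f64_or_default(fields: &[(String, Value)], name: &str) -> f64 {
    match lookup(fields, name) {
        Some(Value::F64(v)) => *v,
        _ => 0.0,
    }
}

fn u8_or_default(fields: &[(String, Value)], name: &str) -> u8 {
    match lookup(fields, name) {
        Some(Value::U8(v)) => *v,
        _ => 0,
    }
}

// Readers pick fields by name; names they do not know are ignored.
fn read_v1(fields: &[(String, Value)]) -> PointV1 {
    PointV1(f64_or_default(fields, "0"), u8_or_default(fields, "1"))
}

fn read_v2(fields: &[(String, Value)]) -> PointV2 {
    PointV2(
        f64_or_default(fields, "0"),
        u8_or_default(fields, "1"),
        f64_or_default(fields, "2"),
    )
}

fn main() {
    // Field added at the end: old payload, new reader.
    let old_payload = write_v1(&PointV1(1.5, 7));
    assert_eq!(read_v2(&old_payload), PointV2(1.5, 7, 0.0));

    // Field removed from the end: new payload, old reader.
    let new_payload = write_v2(&PointV2(1.5, 7, 2.5));
    assert_eq!(read_v1(&new_payload), PointV1(1.5, 7));
}
```
   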
   ### Related issues
   <!-- None -->
   
   ### Does this PR introduce any user-facing change?
   - [ ] Does this PR introduce any public API change?
   - [x] Does this PR introduce any binary protocol compatibility change?
   
   **Note**: Yes, but since tuple struct support was only just added, I think no 
one (except me) is using this feature yet :)
   
   ### Benchmark
   N/A (this is a correctness fix, not a performance optimization)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]