ariesdevil opened a new pull request, #3092: URL: https://github.com/apache/fory/pull/3092
…e schema evolution happy

### Why?

This PR fixes a schema evolution issue with tuple structs. Previously, tuple struct fields were sorted by type (the same as named structs), which broke schema evolution when fields of different types were added. For example, evolving `struct Point(f64, u8)` to `struct Point(f64, u8, f64)` caused fields to be matched incorrectly during deserialization, because the new `f64` field would be sorted before `u8`.

### What does this PR do?

1. Introduce the `SortedField` struct: a helper struct that keeps the original field index alongside the field reference. This lets us track field positions correctly regardless of serialization order.
2. Preserve tuple struct field order: tuple struct fields are no longer sorted by type. Instead, they keep their original definition order ("0", "1", "2", ...). Field names therefore always map to the same positions, which is what schema evolution needs.
3. Unify the protocol for tuple and named structs: both now use the same underlying protocol (field-name-based matching), but with different field name strategies:
   * Named structs: use field identifiers as names (sorted by type for optimal layout)
   * Tuple structs: use positional indices as names (unsorted, to preserve schema evolution)
4. Add schema evolution tests: comprehensive tests for tuple struct schema evolution, including:
   * Adding fields at the end
   * Removing fields from the end
   * Adding fields with different types (`i64`, `u8`, `f64`)

### Related issues

None

### Does this PR introduce any user-facing change?

- [ ] Does this PR introduce any public API change?
- [x] Does this PR introduce any binary protocol compatibility change?

**Note**: Yes, but since tuple struct support was only just added, I think no one (except me) is using this feature yet :)

### Benchmark

N/A (this is a correctness fix, not a performance optimization)
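
The ordering problem described under "Why?" can be illustrated with a small standalone sketch. It does not use Fory's actual types or comparator: `FieldInfo`, `tuple_fields`, `sorted_by_type`, and `old_is_prefix` are hypothetical names, and "sort by type" is approximated here as "larger types first, ties broken by name", which is only an assumption chosen so that `f64` sorts before `u8` as in the example above.

```rust
/// Hypothetical field descriptor: the positional name of a tuple struct
/// field ("0", "1", ...) plus its type name and size in bytes.
#[derive(Debug, Clone, PartialEq)]
struct FieldInfo {
    name: String,
    type_name: &'static str,
    size: usize,
}

/// Builds descriptors in definition order, naming each field by its index.
fn tuple_fields(types: &[(&'static str, usize)]) -> Vec<FieldInfo> {
    types
        .iter()
        .enumerate()
        .map(|(i, &(type_name, size))| FieldInfo {
            name: i.to_string(),
            type_name,
            size,
        })
        .collect()
}

/// Stand-in for "sort by type": larger types first, ties broken by name.
/// This ordering is assumed for illustration; it is not Fory's comparator.
fn sorted_by_type(mut fields: Vec<FieldInfo>) -> Vec<FieldInfo> {
    fields.sort_by(|a, b| b.size.cmp(&a.size).then_with(|| a.name.cmp(&b.name)));
    fields
}

/// True when every slot of the old layout holds the same field in the new
/// layout, i.e. the old layout is a prefix of the new one.
fn old_is_prefix(old: &[FieldInfo], new: &[FieldInfo]) -> bool {
    old.len() <= new.len() && old.iter().zip(new).all(|(o, n)| o == n)
}

/// Collects the field names of a layout, in layout order.
fn names(fields: &[FieldInfo]) -> Vec<&str> {
    fields.iter().map(|f| f.name.as_str()).collect()
}

fn main() {
    // struct Point(f64, u8) evolving to struct Point(f64, u8, f64)
    let v1 = tuple_fields(&[("f64", 8), ("u8", 1)]);
    let v2 = tuple_fields(&[("f64", 8), ("u8", 1), ("f64", 8)]);

    // Sorted layout: the new f64 (field "2") moves ahead of the u8 (field "1"),
    // so slot 1 no longer holds the field the old schema expects there.
    let sorted_v1 = sorted_by_type(v1.clone());
    let sorted_v2 = sorted_by_type(v2.clone());
    assert!(!old_is_prefix(&sorted_v1, &sorted_v2));

    // Definition order: the old layout stays a prefix of the new one.
    assert!(old_is_prefix(&v1, &v2));

    for f in &sorted_v2 {
        println!("sorted slot: field {} ({})", f.name, f.type_name);
    }
    println!("definition order: {:?}", names(&v2));
}
```

In this sketch the definition-order layout only ever grows or shrinks at the end, which matches the "adding fields at the end" and "removing fields from the end" cases the tests cover.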
