lzaeh opened a new pull request, #2760: URL: https://github.com/apache/fory/pull/2760
<!-- **Thanks for contributing to Apache Fory™.** **If this is your first time opening a PR on fory, you can refer to [CONTRIBUTING.md](https://github.com/apache/fory/blob/main/CONTRIBUTING.md).** Contribution Checklist - The **Apache Fory™** community has requirements on the naming of pr titles. You can also find instructions in [CONTRIBUTING.md](https://github.com/apache/fory/blob/main/CONTRIBUTING.md). - Apache Fory™ has a strong focus on performance. If the PR you submit will have an impact on performance, please benchmark it first and provide the benchmark result here. --> ## Why? Previously, my understanding of the type system was insufficient, which led to confusion between array and slice usage. As a result, the deserialization step of the slice serializer did not work properly in cross-language scenarios due to missing alignment rules (for example, when the types were structurally identical and should have taken the fast path, the previous version failed to correctly interpret the byte stream). <!-- Describe the purpose of this PR. --> ## What does this PR do? Serializer-related changes: - Adjusted the slice serializer to better align serialization and deserialization rules. - Standardized the type system to ensure that the slice serializer is consistently used where appropriate, eliminating mixed usage with array serializers. - Moved the serializer for primitive type arrays into ```primitive_array.go```. If you think it’s not appropriate, I can revert the change. Test-related changes: - Re-enabled previously commented-out test cases in commonSlice/Array. - Moved the [n]String test out of commonArray. Although it can be correctly deserialized into []String, the current test logic does not treat [n]String and []String as equivalent. - Updated several primitive array tests in TestXLangSerializer to use array types instead of slices. - Modified the check function in TestXLangSerializer to deserialize using the precise reflected type instead of interface{}. Rationale: - For example, []interface{}{"a", "b", "c"} and []string{"a", "b", "c"} are considered the same type and written via the fast path, resulting in identical byte sequences. - However, during deserialization in Go, the default behavior is to use []interface{} unless the target type is explicitly provided.(Just my opinion) - This means that if the test expects []string{}, it would fail. While we could force construction of a []T{} when the same type is detected, if the test’s starting point is []interface{}, it would still fail. Therefore, the deserialization target type is now explicitly matched via reflection. <!-- Describe the details of this PR. --> Known Issues - After pulling the latest code (which includes modifications related to field order hashing in structs), the complex struct tests that passed 10 mins ago now fail due to Python-side deserialization errors. - A. Direct testing shows identical hashes, but deserialization in Python triggers an “invalid ID (id = 0)” error in the listSerializer—possibly related to changes in this PR. - B. Simplifying the struct on both the Python and Go sides (by removing fields) caused the hash values to differ. I will follow up on this issue to determine whether the cause lies in the list serializer or incomplete handling of struct field order consistency. - Cross-language list serialization tests (serializing in one language, deserializing in another—Python ↔ Go, Java ↔ Go) revealed that Go’s int and float64 types cannot be correctly deserialized in the other languages. - All other basic types work fine, and Go can correctly deserialize bytes from both Python and Java. - When encountering floating-point values, Go must use float32 to deserialize successfully. - The slice serializer currently mimics Python’s fast-path logic, but the matching rules for integer and floating-point types are somewhat ambiguous. My understanding here may be incomplete, and this part likely needs further refinement. ## Related issues - #2633 <!-- Is there any related issue? If this PR closes them you say say fix/closes: - #xxxx0 - #xxxx1 - Fixes #xxxx2 --> ## Does this PR introduce any user-facing change? <!-- If any user-facing interface changes, please [open an issue](https://github.com/apache/fory/issues/new/choose) describing the need to do so and update the document if necessary. Delete section if not applicable. --> - [ ] Does this PR introduce any public API change? - [ ] Does this PR introduce any binary protocol compatibility change? ## Benchmark <!-- When the PR has an impact on performance (if you don't know whether the PR will have an impact on performance, you can submit the PR first, and if it will have impact on performance, the code reviewer will explain it), be sure to attach a benchmark data here. Delete section if not applicable. --> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
