This is an automated email from the ASF dual-hosted git repository. chaokunyang pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/fory-site.git
commit 5b3feacce6b266f393af3107a56885b5317b36bc Author: chaokunyang <[email protected]> AuthorDate: Fri Feb 6 07:11:55 2026 +0000 🔄 synced local 'docs/compiler/' with remote 'docs/compiler/' --- docs/compiler/compiler-guide.md | 124 +++-- docs/compiler/flatbuffers-idl.md | 201 ++++--- docs/compiler/generated-code.md | 1117 ++++++++++++++------------------------ docs/compiler/index.md | 42 +- docs/compiler/protobuf-idl.md | 565 +++++++------------ docs/compiler/schema-idl.md | 418 ++++---------- 6 files changed, 953 insertions(+), 1514 deletions(-) diff --git a/docs/compiler/compiler-guide.md b/docs/compiler/compiler-guide.md index 8bba859bb2..833ea0d3ff 100644 --- a/docs/compiler/compiler-guide.md +++ b/docs/compiler/compiler-guide.md @@ -19,7 +19,7 @@ license: | limitations under the License. --- -This guide covers installation, usage, and integration of the FDL compiler. +This guide covers installation, usage, and integration of the Fory IDL compiler. ## Installation @@ -50,22 +50,35 @@ foryc --scan-generated [OPTIONS] ### Options -| Option | Description | Default | -| ------------------------------------- | ----------------------------------------------------- | ------------- | -| `--lang` | Comma-separated target languages | `all` | -| `--output`, `-o` | Output directory | `./generated` | -| `--package` | Override package name from FDL file | (from file) | -| `-I`, `--proto_path`, `--import_path` | Add directory to import search path (can be repeated) | (none) | -| `--java_out=DST_DIR` | Generate Java code in DST_DIR | (none) | -| `--python_out=DST_DIR` | Generate Python code in DST_DIR | (none) | -| `--cpp_out=DST_DIR` | Generate C++ code in DST_DIR | (none) | -| `--go_out=DST_DIR` | Generate Go code in DST_DIR | (none) | -| `--rust_out=DST_DIR` | Generate Rust code in DST_DIR | (none) | -| `--go_nested_type_style` | Go nested type naming: `camelcase` or `underscore` | (none) | +Compile options: + +| Option | Description | Default | +| 
------------------------------------- | ----------------------------------------------------- | ------------------- | +| `--lang` | Comma-separated target languages | `all` | +| `--output`, `-o` | Output directory | `./generated` | +| `--package` | Override package name from Fory IDL file | (from file) | +| `-I`, `--proto_path`, `--import_path` | Add directory to import search path (can be repeated) | (none) | +| `--java_out=DST_DIR` | Generate Java code in DST_DIR | (none) | +| `--python_out=DST_DIR` | Generate Python code in DST_DIR | (none) | +| `--cpp_out=DST_DIR` | Generate C++ code in DST_DIR | (none) | +| `--go_out=DST_DIR` | Generate Go code in DST_DIR | (none) | +| `--rust_out=DST_DIR` | Generate Rust code in DST_DIR | (none) | +| `--go_nested_type_style` | Go nested type naming: `camelcase` or `underscore` | from schema/default | +| `--emit-fdl` | Print translated Fory IDL for non-`.fdl` inputs | `false` | +| `--emit-fdl-path` | Write translated Fory IDL to a file or directory | (stdout) | + +Scan options (with `--scan-generated`): + +| Option | Description | Default | +| ------------ | ------------------------------ | ------- | +| `--root` | Root directory to scan | `.` | +| `--relative` | Print paths relative to root | `false` | +| `--delete` | Delete matched generated files | `false` | +| `--dry-run` | Scan/print only, do not delete | `false` | ### Scan Generated Files -Use `--scan-generated` to find files produced by the Fory compiler. The scanner walks +Use `--scan-generated` to find files produced by `foryc`. The scanner walks the tree recursively, skips `build/`, `target/`, and hidden directories, and prints each generated file as it is found. 
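The scan options listed in the table above can be combined into a single `foryc --scan-generated` invocation. The following is a minimal sketch, not part of the commit: `build_scan_command` is a hypothetical helper, and it assumes that `--root` accepts its value as a separate argument and that the boolean flags take no value. Whether `--delete` and `--dry-run` may be combined is not stated in the documentation, so the sketch simply passes through whatever the caller requests.

```python
import shlex


def build_scan_command(root=".", relative=False, delete=False, dry_run=False):
    """Assemble an argument list for `foryc --scan-generated`.

    Hypothetical helper: only flags named in the scan options table are used.
    Assumes `--root` takes its value as the next argument.
    """
    cmd = ["foryc", "--scan-generated", "--root", root]
    if relative:
        cmd.append("--relative")  # print paths relative to root
    if delete:
        cmd.append("--delete")  # delete matched generated files
    if dry_run:
        cmd.append("--dry-run")  # scan/print only, do not delete
    return cmd


if __name__ == "__main__":
    # Show the command line without running it.
    print(shlex.join(build_scan_command(root=".", relative=True, dry_run=True)))
```

The list can be handed to `subprocess.run(cmd, check=True)` once `foryc` is on the PATH.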
@@ -153,12 +166,22 @@ foryc schema.fdl --java_out=./gen/java -I proto/ -I common/ When using `--{lang}_out` options: - Only the specified languages are generated (not all languages) -- Files are placed directly in the specified directory (not in a `{lang}/` subdirectory) +- The compiler writes under the specified directory (language-specific generators may still create package/module subdirectories) - This is compatible with protoc-style workflows +**Inspect translated Fory IDL from proto/fbs input:** + +```bash +# Print translated Fory IDL to stdout +foryc schema.proto --emit-fdl + +# Write translated Fory IDL to a directory +foryc schema.fbs --emit-fdl --emit-fdl-path ./translated +``` + ## Import Path Resolution -When compiling FDL files with imports, the compiler searches for imported files in this order: +When compiling Fory IDL files with imports, the compiler searches for imported files in this order: 1. **Relative to the importing file (default)** - The directory containing the file with the import statement is always searched first, automatically. No `-I` flag needed for same-directory imports. 2. 
**Each `-I` path in order** - Additional search paths specified on the command line @@ -227,7 +250,7 @@ generated/ ``` - One file per type (enum or message) -- Package structure matches FDL package +- Package structure matches Fory IDL package - Registration helper class generated ### Python @@ -247,11 +270,12 @@ generated/ ``` generated/ └── go/ - └── example.go + └── example/ + └── example.go ``` - Single file with all types -- Package name from last component of FDL package +- Directory and package name are derived from `go_package` or the Fory IDL package - Registration function included ### Rust @@ -299,14 +323,11 @@ Add to your `pom.xml`: <goal>exec</goal> </goals> <configuration> - <executable>fory</executable> + <executable>foryc</executable> <arguments> - <argument>compile</argument> <argument>${project.basedir}/src/main/fdl/schema.fdl</argument> - <argument>--lang</argument> - <argument>java</argument> - <argument>--output</argument> - <argument>${project.build.directory}/generated-sources/fdl</argument> + <argument>--java_out</argument> + <argument>${project.build.directory}/generated-sources/fory</argument> </arguments> </configuration> </execution> @@ -333,7 +354,7 @@ Add generated sources: </goals> <configuration> <sources> - <source>${project.build.directory}/generated-sources/fdl</source> + <source>${project.build.directory}/generated-sources/fory</source> </sources> </configuration> </execution> @@ -349,10 +370,9 @@ Add to `build.gradle`: ```groovy task generateForyTypes(type: Exec) { - commandLine 'fory', 'compile', + commandLine 'foryc', "${projectDir}/src/main/fdl/schema.fdl", - '--lang', 'java', - '--output', "${buildDir}/generated/sources/fdl" + '--java_out', "${buildDir}/generated/sources/fory" } compileJava.dependsOn generateForyTypes @@ -360,7 +380,7 @@ compileJava.dependsOn generateForyTypes sourceSets { main { java { - srcDir "${buildDir}/generated/sources/fdl/java" + srcDir "${buildDir}/generated/sources/fory" } } } @@ -376,18 +396,17 @@ 
from setuptools import setup from setuptools.command.build_py import build_py import subprocess -class BuildWithFdl(build_py): +class BuildWithForyIdl(build_py): def run(self): subprocess.run([ - 'fory', 'compile', + 'foryc', 'schema.fdl', - '--lang', 'python', - '--output', 'src/generated' + '--python_out', 'src/generated' ], check=True) super().run() setup( - cmdclass={'build_py': BuildWithFdl}, + cmdclass={'build_py': BuildWithForyIdl}, # ... ) ``` @@ -417,13 +436,13 @@ use std::process::Command; fn main() { println!("cargo:rerun-if-changed=schema.fdl"); - let status = Command::new("fory") - .args(&["compile", "schema.fdl", "--lang", "rust", "--output", "src/generated"]) + let status = Command::new("foryc") + .args(&["schema.fdl", "--rust_out", "src/generated"]) .status() - .expect("Failed to run fory compiler"); + .expect("Failed to run foryc"); if !status.success() { - panic!("FDL compilation failed"); + panic!("Fory IDL compilation failed"); } } ``` @@ -433,22 +452,21 @@ fn main() { Add to `CMakeLists.txt`: ```cmake -find_program(FORY_COMPILER fory) +find_program(FORY_COMPILER foryc) add_custom_command( OUTPUT ${CMAKE_CURRENT_SOURCE_DIR}/generated/example.h - COMMAND ${FORY_COMPILER} compile + COMMAND ${FORY_COMPILER} ${CMAKE_CURRENT_SOURCE_DIR}/schema.fdl - --lang cpp - --output ${CMAKE_CURRENT_SOURCE_DIR}/generated + --cpp_out ${CMAKE_CURRENT_SOURCE_DIR}/generated DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/schema.fdl - COMMENT "Generating FDL types" + COMMENT "Generating Fory IDL types" ) -add_custom_target(generate_fdl DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/generated/example.h) +add_custom_target(generate_fory_idl DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/generated/example.h) add_library(mylib ...) 
-add_dependencies(mylib generate_fdl) +add_dependencies(mylib generate_fory_idl) target_include_directories(mylib PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/generated) ``` @@ -461,7 +479,7 @@ genrule( name = "generate_fdl", srcs = ["schema.fdl"], outs = ["generated/example.h"], - cmd = "$(location //:fory_compiler) compile $(SRCS) --lang cpp --output $(RULEDIR)/generated", + cmd = "$(location //:fory_compiler) $(SRCS) --cpp_out $(RULEDIR)/generated", tools = ["//:fory_compiler"], ) @@ -531,13 +549,13 @@ project/ ### Version Control -- **Track**: FDL schema files +- **Track**: Fory IDL schema files - **Ignore**: Generated code (can be regenerated) Add to `.gitignore`: ``` -# Generated FDL code +# Generated Fory IDL code src/generated/ generated/ ``` @@ -549,7 +567,7 @@ Always regenerate during builds: ```yaml # GitHub Actions example steps: - - name: Install FDL Compiler + - name: Install Fory IDL Compiler run: pip install ./compiler - name: Generate Types @@ -582,7 +600,7 @@ message User [id=100] { ### Command Not Found ``` -fory: command not found +foryc: command not found ``` **Solution:** Ensure the compiler is installed and in your PATH: @@ -613,7 +631,7 @@ chmod -R u+w ./generated <dependency> <groupId>org.apache.fory</groupId> <artifactId>fory-core</artifactId> - <version>0.14.1</version> + <version>${fory.version}</version> </dependency> ``` @@ -633,7 +651,7 @@ go get github.com/apache/fory/go/fory ```toml [dependencies] -fory = "0.13" +fory = "x.y.z" ``` **C++:** Ensure Fory headers are in include path. diff --git a/docs/compiler/flatbuffers-idl.md b/docs/compiler/flatbuffers-idl.md index 471b97a1d3..7362a1028c 100644 --- a/docs/compiler/flatbuffers-idl.md +++ b/docs/compiler/flatbuffers-idl.md @@ -19,71 +19,84 @@ license: | limitations under the License. --- -The Fory compiler can ingest FlatBuffers schemas (`.fbs`) and translate them into -Fory IR before code generation. 
This provides a smooth migration path when you -already have FlatBuffers schemas but want Fory-native serialization and codegen. - -## Key Differences vs FDL - -- **Field numbering**: FlatBuffers fields have no explicit IDs; Fory assigns - sequential field numbers based on declaration order, starting at 1. -- **Tables vs structs**: FlatBuffers `table` maps to a Fory message with - `evolving=true`; `struct` maps to `evolving=false`. -- **Default values**: Parsed for compatibility but ignored in generated Fory - code. Use Fory options or language defaults instead. -- **Attributes**: Metadata in `(...)` is mapped to Fory options on types and - fields; FDL uses `[option=value]` inline syntax. -- **Root type**: `root_type` is ignored because Fory does not require a root - message to serialize. -- **Unions**: FlatBuffers `union` is translated into an FDL `union`. Case IDs - follow declaration order, starting at 1. - -## Fory-Specific Attributes - -FlatBuffers attributes use `key:value` syntax. To avoid conflicts with -FlatBuffers tooling, Fory options use the `fory_` prefix and are stripped during -parsing: - -- `fory_ref:true` enables reference tracking for the field. -- `fory_nullable:true` marks the field optional. -- `fory_weak_ref:true` marks a weak reference and implies `ref`. -- `fory_thread_safe_pointer:false` selects the non-thread-safe pointer flavor - for ref fields (it does not imply `ref`). +This page explains how Apache Fory consumes FlatBuffers schemas (`.fbs`) and +translates them into Fory IR for code generation. 
-Example: +## What This Page Covers -```fbs -table TreeNode { - children: [TreeNode] (fory_ref: true); - parent: TreeNode (fory_weak_ref: true); -} -``` +- When to use FlatBuffers input with Fory +- Exact FlatBuffers to Fory mapping behavior +- Supported Fory-specific attributes in `.fbs` +- Migration notes and generated-code differences + +## Why Use Apache Fory + +- Idiomatic generated code: Fory generates language-idiomatic classes/structs + that can be used directly as domain objects. +- Java performance: In Java object-serialization workloads, Fory is faster than + FlatBuffers in Fory benchmarks. +- Other languages: serialization performance is generally in a similar range. +- Deserialization in practice: FlatBuffers does not perform native-object deserialization and is + faster by default, but if your application needs native objects, it requires + conversion and that conversion step can dominate read cost. In those cases, + Fory deserialization is often faster end-to-end. +- Easier APIs: Fory uses direct native objects, so you do not need to + reverse-build tables or manually manage offsets. +- Better graph modeling: Shared and circular references are first-class features + in Fory. + +## Quick Decision Guide + +| Situation | Recommended Path | +| ------------------------------------------------------------------ | ---------------------- | +| You already have `.fbs` schemas and want Fory runtime/codegen | Use FlatBuffers input | +| You are starting new schema work and want full Fory syntax control | Use native Fory IDL | +| You need FlatBuffers wire compatibility at runtime | Keep FlatBuffers stack | +| You need Fory object-graph semantics (`ref`, weak refs, etc.) 
| Use Fory | + +## FlatBuffers to Fory Mapping -## Scalar Type Mapping +### Schema-Level Rules -| FlatBuffers | Fory Primitive | -| ----------- | -------------- | -| `byte` | `int8` | -| `ubyte` | `uint8` | -| `short` | `int16` | -| `ushort` | `uint16` | -| `int` | `int32` | -| `uint` | `uint32` | -| `long` | `int64` | -| `ulong` | `uint64` | -| `float` | `float32` | -| `double` | `float64` | -| `bool` | `bool` | -| `string` | `string` | +- `namespace` maps to Fory package namespace. +- `include` entries map to Fory imports. +- `table` is translated as `evolving=true`. +- `struct` is translated as `evolving=false`. +- `root_type` is parsed but ignored by Fory runtime/codegen. +- `file_identifier` and `file_extension` are parsed but not used by Fory codegen. -Vectors (`[T]`) map to Fory list types. +### Field Numbering -## Union Mapping +FlatBuffers fields do not have explicit field IDs. Fory assigns field numbers by +source declaration order, starting at `1`. -FlatBuffers unions are converted to FDL unions and then to native union APIs in -each target language. +### Scalar Type Mapping -**FlatBuffers:** +| FlatBuffers | Fory Type | +| ----------- | --------- | +| `byte` | `int8` | +| `ubyte` | `uint8` | +| `short` | `int16` | +| `ushort` | `uint16` | +| `int` | `int32` | +| `uint` | `uint32` | +| `long` | `int64` | +| `ulong` | `uint64` | +| `float` | `float32` | +| `double` | `float64` | +| `bool` | `bool` | +| `string` | `string` | + +Vectors (`[T]`) map to Fory lists. + +### Unions + +FlatBuffers unions map to Fory unions. + +- Case IDs are assigned by declaration order, starting at `1`. +- Case names are derived from type names using snake_case field naming. + +**FlatBuffers** ```fbs union Payload { @@ -96,7 +109,7 @@ table Container { } ``` -**FDL (conceptual):** +**Fory shape after translation** ```protobuf union Payload { @@ -109,9 +122,52 @@ message Container { } ``` -Case IDs are derived from the declaration order in the `union`. 
The generated -case names are based on the type names (converted to each language's naming -convention). +### Defaults and Metadata + +- FlatBuffers default values are parsed but not applied as Fory runtime defaults. +- Non-Fory metadata attributes are preserved as generic options in IR and may be + consumed by downstream tooling. + +## Fory-Specific Attributes in FlatBuffers + +FlatBuffers metadata attributes use `key:value`. For Fory-specific options, use +`fory_` (or `fory.`) prefix in `.fbs`; the prefix is removed during parsing. + +### Supported Field Attributes + +| FlatBuffers Attribute | Effect in Fory | +| -------------------------------- | ----------------------------------------------------- | +| `fory_ref:true` | Enable reference tracking for the field | +| `fory_nullable:true` | Mark field optional/nullable | +| `fory_weak_ref:true` | Enable weak reference semantics and implies `ref` | +| `fory_thread_safe_pointer:false` | For ref fields, select non-thread-safe pointer flavor | + +Semantics: + +- `fory_weak_ref:true` implies `ref`. +- `fory_thread_safe_pointer` only takes effect when the field is ref-tracked. +- For list fields, `fory_ref:true` applies to list elements. + +Example: + +```fbs +table Node { + parent: Node (fory_weak_ref: true); + children: [Node] (fory_ref: true); + cached: Node (fory_ref: true, fory_thread_safe_pointer: false); +} +``` + +## Generated Code Differences + +Using `.fbs` as input to Fory still produces normal Fory-generated code, not +FlatBuffers `ByteBuffer`-style APIs. + +- Java: POJOs/records with Fory metadata +- Python: dataclasses plus registration helpers +- Go/Rust/C++: native structs and Fory metadata + +The serialization format is Fory binary protocol, not FlatBuffers wire format. 
## Usage @@ -121,22 +177,23 @@ Compile a FlatBuffers schema directly: foryc schema.fbs --lang java,python --output ./generated ``` -To inspect the translated FDL for debugging: +Inspect translated schema syntax for debugging: ```bash foryc schema.fbs --emit-fdl --emit-fdl-path ./translated ``` -## Generated Code Differences +## Migration Notes -FlatBuffers-generated APIs are centered around `ByteBuffer` accessors and -builders. Fory code generation instead produces native language structures and -registration helpers, the same as when compiling FDL: +1. Keep existing `namespace` values stable to keep type registration stable. +2. Review fields that relied on FlatBuffers default literals and set explicit + defaults in application code if needed. +3. Add `fory_ref`/`fory_weak_ref` where object-graph semantics are required. +4. Validate generated model behavior with roundtrip tests before replacing + existing serialization paths. -- **Java**: Plain POJOs with Fory annotations. -- **Python**: Dataclasses with registration helpers. -- **Go/Rust/C++**: Native structs with Fory metadata. +## Summary -Because Fory generates native types, the resulting APIs are different from -FlatBuffers builder/accessor APIs, and the serialization format is Fory's binary -protocol rather than the FlatBuffers wire format. +FlatBuffers input lets you reuse existing `.fbs` schemas while moving to Fory's +runtime and code generation model. This is useful for incremental migration, +while preserving schema investment and adopting Fory-native object APIs. diff --git a/docs/compiler/generated-code.md b/docs/compiler/generated-code.md index e90942510e..225576d85a 100644 --- a/docs/compiler/generated-code.md +++ b/docs/compiler/generated-code.md @@ -19,185 +19,45 @@ license: | limitations under the License. --- -This document explains the code generated by the FDL compiler for each target language. +This document explains generated code for each target language. 
-## Example Schema +Fory IDL generated types are idiomatic in host languages and can be used directly as domain objects. Generated types also include `to/from bytes` helpers and registration helpers. -The examples in this document use this FDL schema: +All snippets are representative excerpts from real generated output. + +## Example Fory IDL Schema + +The sections below use this schema: ```protobuf package demo; -enum Status [id=100] { - PENDING = 0; - ACTIVE = 1; - COMPLETED = 2; +enum DeviceTier [id=100] { + DEVICE_TIER_UNKNOWN = 0; + DEVICE_TIER_TIER1 = 1; + DEVICE_TIER_TIER2 = 2; } message User [id=101] { string id = 1; string name = 2; optional string email = 3; - int32 age = 4; -} - -message Order [id=102] { - string id = 1; - ref User customer = 2; - list<string> items = 3; - map<string, int32> quantities = 4; - Status status = 5; } -``` - -## Enum Prefix Stripping - -When enum values use a protobuf-style prefix (enum name in UPPER_SNAKE_CASE), the compiler automatically strips the prefix for languages with scoped enums. This produces cleaner, more idiomatic code. - -**Input FDL:** - -```protobuf -enum DeviceTier { - DEVICE_TIER_UNKNOWN = 0; - DEVICE_TIER_TIER1 = 1; - DEVICE_TIER_TIER2 = 2; -} -``` - -**Generated output by language:** -| Language | Generated Values | Notes | -| -------- | ----------------------------------------- | ------------------------- | -| Java | `UNKNOWN, TIER1, TIER2` | Scoped enum | -| Rust | `Unknown, Tier1, Tier2` | PascalCase variants | -| C++ | `UNKNOWN, TIER1, TIER2` | Scoped `enum class` | -| Python | `UNKNOWN, TIER1, TIER2` | Scoped `IntEnum` | -| Go | `DeviceTierUnknown, DeviceTierTier1, ...` | Unscoped, prefix re-added | - -**Note:** Go uses unscoped constants, so the enum name prefix is added back to avoid naming collisions. - -## Nested Types - -When using nested message and enum definitions, the generated code varies by language. 
- -**Input FDL:** - -```protobuf -message SearchResponse { - message Result { +message SearchResponse [id=102] { + message Result [id=103] { string url = 1; string title = 2; } list<Result> results = 1; } -``` - -### Java - Inner Classes - -```java -public class SearchResponse { - public static class Result { - private String url; - private String title; - // getters, setters... - } - - private List<Result> results; - // getters, setters... -} -``` - -### Python - Nested Classes - -```python -@dataclass -class SearchResponse: - @dataclass - class Result: - url: str = "" - title: str = "" - - results: List[Result] = field(default_factory=list) -``` - -### Go - Underscore - -```go -type SearchResponse_Result struct { - Url string - Title string -} - -type SearchResponse struct { - Results []SearchResponse_Result -} -``` - -**Note:** Set `option (fory).go_nested_type_style = "camelcase";` to generate `SearchResponseResult` instead. - -### Rust - Nested Module - -```rust -pub mod search_response { - use super::*; - - #[derive(ForyObject)] - pub struct Result { - pub url: String, - pub title: String, - } -} - -#[derive(ForyObject)] -pub struct SearchResponse { - pub results: Vec<search_response::Result>, -} -``` - -### C++ - Nested Classes - -```cpp -class SearchResponse final { - public: - class Result final { - public: - std::string url; - std::string title; - }; - - std::vector<Result> results; -}; - -FORY_STRUCT(SearchResponse::Result, url, title); -FORY_STRUCT(SearchResponse, results); -``` - -**Summary:** - -| Language | Approach | Syntax Example | -| -------- | ------------------------- | ------------------------- | -| Java | Static inner classes | `SearchResponse.Result` | -| Python | Nested dataclasses | `SearchResponse.Result` | -| Go | Underscore (configurable) | `SearchResponse_Result` | -| Rust | Nested module | `search_response::Result` | -| C++ | Nested classes | `SearchResponse::Result` | - -## Union Generation - -FDL unions generate type-safe APIs with 
an explicit active case. This example is -based on `integration_tests/idl_tests/idl/addressbook.fdl`: - -```protobuf -package addressbook; message Dog [id=104] { string name = 1; - int32 bark_volume = 2; } message Cat [id=105] { string name = 1; - int32 lives = 2; } union Animal [id=106] { @@ -205,734 +65,589 @@ union Animal [id=106] { Cat cat = 2; } -message Person [id=100] { - Animal pet = 8; -} -``` - -### Java - -```java -Animal pet = Animal.ofDog(new Dog()); -if (pet.hasDog()) { - Dog dog = pet.getDog(); -} -Animal.AnimalCase caseId = pet.getAnimalCase(); -``` - -### Python - -```python -pet = Animal.dog(Dog(name="Rex", bark_volume=5)) -if pet.is_dog(): - dog = pet.dog_value() -case_id = pet.case_id() -``` - -### Go - -```go -pet := DogAnimal(&Dog{Name: "Rex", BarkVolume: 5}) -if dog, ok := pet.AsDog(); ok { - _ = dog +message Order [id=107] { + string id = 1; + ref User customer = 2; + list<string> items = 3; + map<string, int32> quantities = 4; + DeviceTier tier = 5; + Animal pet = 6; } -_ = pet.Visit(AnimalVisitor{ - Dog: func(d *Dog) error { return nil }, -}) ``` -### Rust - -```rust -let pet = Animal::Dog(Dog { - name: "Rex".into(), - bark_volume: 5, -}); -``` - -### C++ +## Java -```cpp -addressbook::Animal pet = addressbook::Animal::dog( - addressbook::Dog{"Rex", 5}); -if (pet.is_dog()) { - const addressbook::Dog& dog = pet.dog(); -} -``` +### Output Layout -Generated registration helpers also register union types, for example: +For package `demo`, Java code is generated under `demo/`: -- Java: `fory.registerUnion(Animal.class, 106, new UnionSerializer(...))` -- Python: `fory.register_union(Animal, type_id=106, serializer=AnimalSerializer(fory))` -- Go: `f.RegisterUnion(...)` -- Rust: `fory.register_union::<Animal>(106)?` -- C++: `FORY_UNION(addressbook::Animal, ...)` +- `DeviceTier.java`, `User.java`, `SearchResponse.java`, `Dog.java`, `Cat.java`, `Animal.java`, `Order.java` +- `DemoForyRegistration.java` -## Java +### Type Generation -### Enum 
Generation +Enum prefix stripping keeps scoped enum values clean: ```java -package demo; - -public enum Status { - PENDING, - ACTIVE, - COMPLETED; +public enum DeviceTier { + UNKNOWN, + TIER1, + TIER2; } ``` -### Message Generation +Messages are regular Java classes with `@ForyField` metadata and Java-style getters/setters: ```java -package demo; - -import java.util.List; -import java.util.Map; -import org.apache.fory.annotation.ForyField; - -public class User { +public class Order { + @ForyField(id = 1) private String id; - private String name; - - @ForyField(nullable = true) - private String email; - private int age; - - public User() { - } - - public String getId() { - return id; - } - - public void setId(String id) { - this.id = id; - } + @ForyField(id = 2, nullable = true, ref = true) + private User customer; - public String getName() { - return name; - } + @ForyField(id = 3) + private List<String> items; - public void setName(String name) { - this.name = name; - } + @ForyField(id = 4) + private Map<String, Integer> quantities; - public String getEmail() { - return email; - } + @ForyField(id = 5) + private DeviceTier tier; - public void setEmail(String email) { - this.email = email; - } + @ForyField(id = 6) + private Animal pet; - public int getAge() { - return age; - } + public String getId() { ... } + public void setId(String id) { ... } + public User getCustomer() { ... } + public void setCustomer(User customer) { ... } - public void setAge(int age) { - this.age = age; - } + public byte[] toBytes() { ... } + public static Order fromBytes(byte[] bytes) { ... 
} } ``` -```java -package demo; - -import java.util.List; -import java.util.Map; -import org.apache.fory.annotation.ForyField; - -public class Order { - private String id; - - @ForyField(ref = true) - private User customer; - - private List<String> items; - private Map<String, Integer> quantities; - private Status status; - - public Order() { - } +Nested messages become static inner classes: - // Getters and setters... +```java +public class SearchResponse { + public static class Result { ... } } ``` -### Registration Helper +Unions generate explicit case APIs: ```java -package demo; - -import org.apache.fory.Fory; -import org.apache.fory.ThreadSafeFory; -import org.apache.fory.pool.SimpleForyPool; - -public class DemoForyRegistration { +Animal pet = Animal.ofDog(new Dog()); +if (pet.hasDog()) { + Dog dog = pet.getDog(); +} +Animal.AnimalCase c = pet.getAnimalCase(); +int caseId = pet.getAnimalCaseId(); +``` - private static ThreadSafeFory createFory() { - ThreadSafeFory fory = new SimpleForyPool(c -> Fory.builder() - .withXlang(true) - .withRefTracking(true) - .build()); - fory.registerCallback(f -> registerAllTypes(f)); - return fory; - } +### Registration - private static void registerAllTypes(Fory fory) { - register(fory); - } +Generated registration helper: +```java +public class DemoForyRegistration { public static void register(Fory fory) { - fory.register(Status.class, 100); - fory.register(User.class, 101); - fory.register(Order.class, 102); + org.apache.fory.resolver.TypeResolver resolver = fory.getTypeResolver(); + resolver.register(DeviceTier.class, 100L); + resolver.registerUnion( + Animal.class, + 106L, + new org.apache.fory.serializer.UnionSerializer(fory, Animal.class)); + resolver.register(User.class, 101L); + resolver.register(SearchResponse.class, 102L); + resolver.register(SearchResponse.Result.class, 103L); + resolver.register(Dog.class, 104L); + resolver.register(Cat.class, 105L); + resolver.register(Order.class, 107L); } } ``` -`register` 
only contains types defined in the current file. The generated -`registerAllTypes` registers imported types first and then calls `register`. +If you disable auto IDs (`option enable_auto_type_id = false;`), registration switches to namespace + type name: + +```java +resolver.register(Config.class, "myapp.models", "Config"); +resolver.registerUnion( + Holder.class, + "myapp.models", + "Holder", + new org.apache.fory.serializer.UnionSerializer(fory, Holder.class)); +``` ### Usage ```java -import demo.*; - -public class Example { - public static void main(String[] args) { - User user = new User(); - user.setId("u123"); - user.setName("Alice"); - user.setAge(30); - - Order order = new Order(); - order.setId("o456"); - order.setCustomer(user); - order.setStatus(Status.ACTIVE); - - byte[] bytes = order.toBytes(); - Order restored = Order.fromBytes(bytes); - } -} +Order order = new Order(); +order.setId("o456"); +order.setCustomer(new User()); +order.setTier(DeviceTier.TIER1); +order.setPet(Animal.ofDog(new Dog())); + +byte[] bytes = order.toBytes(); +Order restored = Order.fromBytes(bytes); ``` ## Python -### Module Generation +### Output Layout + +One module is generated per package, for example `demo.py`. + +### Type Generation + +Enums are `IntEnum` values with prefix stripping: ```python -# Licensed to the Apache Software Foundation (ASF)... 
+class DeviceTier(IntEnum): + UNKNOWN = 0 + TIER1 = 1 + TIER2 = 2 +``` -from dataclasses import dataclass -from enum import IntEnum -from typing import Dict, List, Optional -import pyfory +Messages are `@pyfory.dataclass` classes: + +```python +@pyfory.dataclass(repr=False) +class Order: + id: str = pyfory.field(id=1, default="") + customer: Optional[User] = pyfory.field(id=2, nullable=True, ref=True, default=None) + items: List[str] = pyfory.field(id=3, default_factory=list) + quantities: Dict[str, pyfory.int32] = pyfory.field(id=4, default_factory=dict) + tier: DeviceTier = pyfory.field(id=5, default=None) + pet: Animal = pyfory.field(id=6, default=None) + def to_bytes(self) -> bytes: ... + @classmethod + def from_bytes(cls, data: bytes) -> "Order": ... +``` -class Status(IntEnum): - PENDING = 0 - ACTIVE = 1 - COMPLETED = 2 +Nested messages stay nested: +```python +@pyfory.dataclass +class SearchResponse: + @pyfory.dataclass + class Result: + url: str = pyfory.field(id=1, default="") + title: str = pyfory.field(id=2, default="") +``` -@dataclass -class User: - id: str = "" - name: str = "" - email: Optional[str] = None - age: pyfory.int32 = 0 +Unions generate case enum + typed accessors: +```python +pet = Animal.dog(Dog(name="Rex")) +if pet.is_dog(): + dog = pet.dog_value() +case_id = pet.case_id() +``` -@dataclass -class Order: - id: str = "" - customer: Optional[User] = None - items: List[str] = None - quantities: Dict[str, pyfory.int32] = None - status: Status = None +### Registration +Generated registration function: +```python def register_demo_types(fory: pyfory.Fory): - fory.register_type(Status, type_id=100) + fory.register_type(DeviceTier, type_id=100) + fory.register_union(Animal, type_id=106, serializer=AnimalSerializer(fory)) fory.register_type(User, type_id=101) - fory.register_type(Order, type_id=102) + fory.register_type(SearchResponse, type_id=102) + fory.register_type(SearchResponse.Result, type_id=103) + fory.register_type(Dog, type_id=104) +
fory.register_type(Cat, type_id=105) + fory.register_type(Order, type_id=107) +``` +If auto IDs are disabled: -def _register_all_types(fory: pyfory.Fory): - register_demo_types(fory) +```python +fory.register_type(Config, namespace="myapp.models", typename="Config") +fory.register_union( + Holder, + namespace="myapp.models", + typename="Holder", + serializer=HolderSerializer(fory), +) ``` -`register_demo_types` only contains types defined in the current file. The -generated `_register_all_types` registers imported types first and then calls -`register_demo_types`. - ### Usage ```python -from demo import User, Order, Status - -user = User(id="u123", name="Alice", age=30) order = Order( id="o456", - customer=user, - items=["item1", "item2"], - quantities={"item1": 2, "item2": 1}, - status=Status.ACTIVE + customer=User(id="u1", name="Alice"), + items=["a", "b"], + quantities={"a": 1, "b": 2}, + tier=DeviceTier.TIER1, + pet=Animal.dog(Dog(name="Rex")), ) data = order.to_bytes() restored = Order.from_bytes(data) ``` -## Go - -### File Generation - -```go -// Licensed to the Apache Software Foundation (ASF)... +## Rust -package demo +### Output Layout -import ( - fory "github.com/apache/fory/go/fory" -) +One Rust module file per package, for example `demo.rs`. 
-type Status int32 +### Type Generation -const ( - StatusPending Status = 0 - StatusActive Status = 1 - StatusCompleted Status = 2 -) +Enums are strongly typed and use stripped, idiomatic variant names: -type User struct { - Id string - Name string - Email *string `fory:"nullable"` - Age int32 -} - -type Order struct { - Id string - Customer *User `fory:"ref"` - Items []string - Quantities map[string]int32 - Status Status +```rust +#[derive(ForyObject, Debug, Clone, PartialEq, Default)] +#[repr(i32)] +pub enum DeviceTier { + #[default] + Unknown = 0, + Tier1 = 1, + Tier2 = 2, } +``` -func RegisterTypes(f *fory.Fory) error { - if err := f.RegisterEnum(Status(0), 100); err != nil { - return err - } - if err := f.Register(User{}, 101); err != nil { - return err - } - if err := f.Register(Order{}, 102); err != nil { - return err - } - return nil -} +Messages derive `ForyObject`: -func registerAllTypes(f *fory.Fory) error { - if err := RegisterTypes(f); err != nil { - return err - } - return nil +```rust +#[derive(ForyObject, Clone, PartialEq, Default)] +pub struct Order { + #[fory(id = 1)] + pub id: String, + #[fory(id = 2, nullable = true, ref = true)] + pub customer: Option<Arc<User>>, + #[fory(id = 3)] + pub items: Vec<String>, + #[fory(id = 4)] + pub quantities: HashMap<String, i32>, + #[fory(id = 5)] + pub tier: DeviceTier, + #[fory(id = 6, type_id = "union")] + pub pet: Animal, } ``` -`RegisterTypes` only contains types defined in the current file. The generated -`registerAllTypes` registers imported types first and then calls `RegisterTypes`. 
- -### Usage - -```go -package main - -import ( - "demo" -) +Nested types are generated in nested modules: -func main() { - email := "[email protected]" - user := &demo.User{ - Id: "u123", - Name: "Alice", - Email: &email, - Age: 30, - } - - order := &demo.Order{ - Id: "o456", - Customer: user, - Items: []string{"item1", "item2"}, - Quantities: map[string]int32{ - "item1": 2, - "item2": 1, - }, - Status: demo.StatusActive, - } - bytes, err := order.ToBytes() - if err != nil { - panic(err) - } - var restored demo.Order - if err := restored.FromBytes(bytes); err != nil { - panic(err) - } +```rust +pub mod search_response { + #[derive(ForyObject, Debug, Clone, PartialEq, Default)] + pub struct Result { ... } } ``` -## Rust - -### Module Generation +Unions map to Rust enums with per-case IDs: ```rust -// Licensed to the Apache Software Foundation (ASF)... - -use fory::{Fory, ForyObject}; -use std::collections::HashMap; -use std::sync::Arc; - -#[derive(ForyObject, Debug, Clone, PartialEq, Default)] -#[repr(i32)] -pub enum Status { - #[default] - Pending = 0, - Active = 1, - Completed = 2, +#[derive(ForyObject, Debug, Clone, PartialEq)] +pub enum Animal { + #[fory(id = 1)] + Dog(Dog), + #[fory(id = 2)] + Cat(Cat), } +``` -#[derive(ForyObject, Debug, Clone, PartialEq, Default)] -pub struct User { - pub id: String, - pub name: String, - #[fory(nullable = true)] - pub email: Option<String>, - pub age: i32, -} +### Registration -#[derive(ForyObject, Debug, Clone, PartialEq, Default)] -pub struct Order { - pub id: String, - pub customer: Arc<User>, - pub items: Vec<String>, - pub quantities: HashMap<String, i32>, - pub status: Status, -} +Generated registration function: +```rust pub fn register_types(fory: &mut Fory) -> Result<(), fory::Error> { - fory.register::<Status>(100)?; + fory.register::<DeviceTier>(100)?; + fory.register_union::<Animal>(106)?; fory.register::<User>(101)?; - fory.register::<Order>(102)?; - Ok(()) -} - -fn register_all_types(fory: &mut Fory) -> 
Result<(), fory::Error> { - register_types(fory)?; + fory.register::<search_response::Result>(103)?; + fory.register::<SearchResponse>(102)?; + fory.register::<Dog>(104)?; + fory.register::<Cat>(105)?; + fory.register::<Order>(107)?; Ok(()) } ``` -`register_types` only contains types defined in the current file. The generated -`register_all_types` registers imported types first and then calls -`register_types`. +If auto IDs are disabled: -**Note:** Rust uses `Arc` by default for `ref` fields. In FDL, use -`ref(thread_safe=false)` to generate `Rc`, and `ref(weak=true)` to generate -`ArcWeak`/`RcWeak`. For protobuf/IDL extensions, use -`[(fory).thread_safe_pointer = false]` and `[(fory).weak_ref = true]`. +```rust +fory.register_by_namespace::<Config>("myapp.models", "Config")?; +fory.register_union_by_namespace::<Holder>("myapp.models", "Holder")?; +``` ### Usage ```rust -use demo::{User, Order, Status}; -use std::sync::Arc; -use std::collections::HashMap; - -fn main() -> Result<(), fory::Error> { - let user = Arc::new(User { - id: "u123".to_string(), - name: "Alice".to_string(), - email: Some("[email protected]".to_string()), - age: 30, - }); - - let mut quantities = HashMap::new(); - quantities.insert("item1".to_string(), 2); - quantities.insert("item2".to_string(), 1); - - let order = Order { - id: "o456".to_string(), - customer: user, - items: vec!["item1".to_string(), "item2".to_string()], - quantities, - status: Status::Active, - }; - - let bytes = order.to_bytes()?; - let restored = Order::from_bytes(&bytes)?; +let order = Order { + id: "o456".into(), + customer: Some(Arc::new(User::default())), + items: vec!["a".into(), "b".into()], + quantities: HashMap::new(), + tier: DeviceTier::Tier1, + pet: Animal::Dog(Dog { name: "Rex".into() }), +}; - Ok(()) -} +let bytes = order.to_bytes()?; +let restored = Order::from_bytes(&bytes)?; ``` ## C++ -### Header Generation +### Output Layout + +One header per package, for example `demo.h`. 
+ +### Type Generation + +Enums are generated as `enum class` with stripped names: ```cpp -/* - * Licensed to the Apache Software Foundation (ASF)... - */ - -#ifndef DEMO_H_ -#define DEMO_H_ - -#include <cstdint> -#include <map> -#include <memory> -#include <optional> -#include <string> -#include <vector> -#include "fory/serialization/fory.h" - -namespace demo { - -struct User; -struct Order; - -enum class Status : int32_t { - PENDING = 0, - ACTIVE = 1, - COMPLETED = 2, +enum class DeviceTier : int32_t { + UNKNOWN = 0, + TIER1 = 1, + TIER2 = 2, }; -FORY_ENUM(Status, PENDING, ACTIVE, COMPLETED); +FORY_ENUM(demo::DeviceTier, UNKNOWN, TIER1, TIER2); +``` -struct User { - std::string id; - std::string name; - std::optional<std::string> email; - int32_t age; +Messages are generated as classes with typed accessors and private fields, including `has_xxx`, `mutable_xxx`, and `set_xxx` where applicable: - bool operator==(const User& other) const { - return id == other.id && name == other.name && - email == other.email && age == other.age; - } -}; -FORY_STRUCT(User, id, name, email, age); - -struct Order { - std::string id; - std::shared_ptr<User> customer; - std::vector<std::string> items; - std::map<std::string, int32_t> quantities; - Status status; - - bool operator==(const Order& other) const { - return id == other.id && customer == other.customer && - items == other.items && quantities == other.quantities && - status == other.status; - } +```cpp +class Order final { + public: + const std::string& id() const; + std::string* mutable_id(); + template <class Arg, class... Args> + void set_id(Arg&& arg, Args&&... 
args); + + bool has_customer() const; + const std::shared_ptr<User>& customer() const; + std::shared_ptr<User>* mutable_customer(); + void set_customer(std::shared_ptr<User> value); + void clear_customer(); + + fory::Result<std::vector<uint8_t>, fory::Error> to_bytes() const; + static fory::Result<Order, fory::Error> from_bytes( + const std::vector<uint8_t>& data); + + private: + std::string id_; + std::shared_ptr<User> customer_; + std::vector<std::string> items_; + std::map<std::string, int32_t> quantities_; + DeviceTier tier_; + Animal pet_; + + public: + FORY_STRUCT(Order, id_, customer_, items_, quantities_, tier_, pet_); }; -FORY_STRUCT(Order, id, customer, items, quantities, status); +``` -inline void register_types(fory::serialization::BaseFory& fory) { - fory.register_enum<Status>(100); - fory.register_struct<User>(101); - fory.register_struct<Order>(102); -} +Nested messages are nested classes: -inline void register_all_types(fory::serialization::BaseFory& fory) { - register_types(fory); -} +```cpp +class SearchResponse final { + public: + class Result final { ... }; +}; +``` -} // namespace demo +Unions are generated as tagged `std::variant` wrappers: -#endif // DEMO_H_ +```cpp +Animal pet = Animal::dog(Dog{}); +if (pet.is_dog()) { + const Dog& dog = pet.dog(); +} +uint32_t case_id = pet.animal_case_id(); ``` -`register_types` only contains types defined in the current file. The generated -`register_all_types` registers imported types first and then calls -`register_types`. 
+### Registration -### Usage +Generated registration function: ```cpp -#include "demo.h" -#include <iostream> - -int main() { - auto user = std::make_shared<demo::User>(); - user->id = "u123"; - user->name = "Alice"; - user->email = "[email protected]"; - user->age = 30; - - demo::Order order; - order.id = "o456"; - order.customer = user; - order.items = {"item1", "item2"}; - order.quantities = {{"item1", 2}, {"item2", 1}}; - order.status = demo::Status::ACTIVE; - - auto bytes = order.to_bytes(); - auto restored = demo::Order::from_bytes(bytes.value()); - - return 0; +inline void register_types(fory::serialization::BaseFory& fory) { + fory.register_enum<DeviceTier>(100); + fory.register_union<Animal>(106); + fory.register_struct<User>(101); + fory.register_struct<SearchResponse::Result>(103); + fory.register_struct<SearchResponse>(102); + fory.register_struct<Dog>(104); + fory.register_struct<Cat>(105); + fory.register_struct<Order>(107); } ``` -**Note:** C++ uses `std::shared_ptr<T>` for `ref` fields. Set -`ref(weak=true)` in FDL (or `[(fory).weak_ref = true]` in protobuf) to generate -`fory::serialization::SharedWeak<T>` for weak references. 
+If auto IDs are disabled: -## Generated Annotations Summary - -### Java Annotations +```cpp +fory.register_struct<Config>("myapp.models", "Config"); +fory.register_union<Holder>("myapp.models", "Holder"); +``` -| Annotation | Purpose | -| ----------------------------- | -------------------------- | -| `@ForyField(nullable = true)` | Marks field as nullable | -| `@ForyField(ref = true)` | Enables reference tracking | +### Usage -### Python Type Hints +```cpp +demo::Order order; +order.set_id("o456"); +order.set_customer(std::make_shared<demo::User>()); -| Hint | Purpose | -| -------------- | ------------------- | -| `Optional[T]` | Nullable field | -| `List[T]` | Repeated field | -| `Dict[K, V]` | Map field | -| `pyfory.int32` | Fixed-width integer | +auto bytes_result = order.to_bytes(); +if (!bytes_result.ok()) { + return 1; +} +auto order_result = demo::Order::from_bytes(bytes_result.value()); +if (!order_result.ok()) { + return 1; +} +demo::Order restored = std::move(order_result.value()); +``` -### Go Struct Tags +## Go -| Tag | Purpose | -| ----------------- | -------------------------- | -| `fory:"nullable"` | Marks field as nullable | -| `fory:"ref"` | Enables reference tracking | +### Output Layout -### Rust Attributes +Go output path depends on whether `go_package` is configured. 
-| Attribute | Purpose | -| -------------------------- | -------------------------- | -| `#[derive(ForyObject)]` | Enables Fory serialization | -| `#[fory(nullable = true)]` | Marks field as nullable | -| `#[repr(i32)]` | Enum representation | +When `go_package` is set in schema options (as in `integration_tests/idl_tests/idl/addressbook.fdl`), output follows that package path, for example: -### C++ Macros +- `integration_tests/idl_tests/go/addressbook/generated/addressbook.go` -| Macro | Purpose | -| ---------------------------- | ----------------------- | -| `FORY_STRUCT(T[, fields..])` | Registers struct fields | -| `FORY_ENUM(T, values..)` | Registers enum values | +Without `go_package`, compiler derives output from the Fory IDL package name. -## Name-Based Registration +For package `demo`, output is: -When types don't have explicit type IDs and `enable_auto_type_id = false`, they use -namespace-based registration: +- `<go_out>/demo/demo.go` -### FDL +For package `myapp.models`, output is: -```protobuf -package myapp.models; +- `<go_out>/models/myapp_models.go` -message Config { // No @id - string key = 1; - string value = 2; -} -``` +### Type Generation -### Generated Registration +Enums keep Go-style unscoped constant names: -**Java:** +```go +type DeviceTier int32 -```java -fory.register(Config.class, "myapp.models", "Config"); +const ( + DeviceTierUnknown DeviceTier = 0 + DeviceTierTier1 DeviceTier = 1 + DeviceTierTier2 DeviceTier = 2 +) ``` -**Python:** +Messages are regular structs with fory tags: -```python -fory.register_type(Config, namespace="myapp.models", typename="Config") +```go +type User struct { + Id string `fory:"id=1"` + Name string `fory:"id=2"` + Email optional.Optional[string] `fory:"id=3,nullable"` +} + +type Order struct { + Id string `fory:"id=1"` + Customer *User `fory:"id=2,nullable,ref"` + Items []string `fory:"id=3"` + Quantities map[string]int32 `fory:"id=4"` + Tier DeviceTier `fory:"id=5"` + Pet Animal `fory:"id=6"` +} ``` 
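The Go output layout described above gives two examples for the case where `go_package` is not set: `demo` becomes `<go_out>/demo/demo.go` and `myapp.models` becomes `<go_out>/models/myapp_models.go`. As a rough sketch only, the rule those two examples suggest can be written as below. The function name `derive_go_output_path` is hypothetical and is not part of the compiler. The rule itself (directory = last package segment, file name = full package with dots replaced by underscores) is an assumption inferred from the two documented paths, not a statement of how the compiler is implemented.

```python
from pathlib import PurePosixPath


def derive_go_output_path(go_out: str, package: str) -> str:
    """Guess the generated Go file path for a Fory IDL package.

    Hypothetical helper: the rule is inferred from the two documented
    examples only and may not match the compiler in other cases.
    """
    segments = package.split(".")
    # Assumption: the output directory is the last package segment.
    directory = segments[-1]
    # Assumption: the file name is the full package name with dots
    # replaced by underscores, plus the ".go" suffix.
    file_name = "_".join(segments) + ".go"
    return str(PurePosixPath(go_out) / directory / file_name)


if __name__ == "__main__":
    print(derive_go_output_path("<go_out>", "demo"))
    print(derive_go_output_path("<go_out>", "myapp.models"))
```

This covers only the no-`go_package` case. When `go_package` is set, the text says output follows that package path, which this sketch does not model.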
-**Go:** +Nested type naming defaults to underscore: ```go -f.RegisterTagType("myapp.models.Config", Config{}) +type SearchResponse_Result struct { ... } ``` -**Rust:** +You can switch to concatenated names with: -```rust -fory.register_by_namespace::<Config>("myapp.models", "Config")?; +```protobuf +option go_nested_type_style = "camelcase"; ``` -**C++:** +Unions generate typed case helpers: -```cpp -fory.register_struct<Config>("myapp.models", "Config"); +```go +pet := DogAnimal(&Dog{Name: "Rex"}) +if dog, ok := pet.AsDog(); ok { + _ = dog +} +_ = pet.Visit(AnimalVisitor{ + Dog: func(d *Dog) error { return nil }, +}) ``` -## Customization - -### Extending Generated Code - -Generated code can be extended through language-specific mechanisms: +### Registration -**Java:** Use inheritance or composition: +Generated registration function: -```java -public class ExtendedUser extends User { - public String getDisplayName() { - return getName() + " <" + getEmail() + ">"; +```go +func RegisterTypes(f *fory.Fory) error { + if err := f.RegisterEnum(DeviceTier(0), 100); err != nil { + return err + } + if err := f.RegisterUnion(Animal{}, 106, ...); err != nil { + return err + } + if err := f.RegisterStruct(User{}, 101); err != nil { + return err } + // ... SearchResponse_Result, SearchResponse, Dog, Cat, Order + return nil } ``` -**Python:** Add methods after import: - -```python -from demo import User +If auto IDs are disabled: -def get_display_name(self): - return f"{self.name} <{self.email}>" - -User.get_display_name = get_display_name +```go +if err := f.RegisterNamedStruct(Config{}, "myapp.models.Config"); err != nil { ... } +if err := f.RegisterNamedUnion(Holder{}, "myapp.models.Holder", ...); err != nil { ... 
} ``` -**Go:** Use separate file in same package: +### Usage ```go -package demo +email := optional.Some("[email protected]") +order := &Order{ + Id: "o456", + Customer: &User{Id: "u1", Name: "Alice", Email: email}, + Items: []string{"a", "b"}, + Tier: DeviceTierTier1, + Pet: DogAnimal(&Dog{Name: "Rex"}), +} -func (u *User) DisplayName() string { - return u.Name + " <" + *u.Email + ">" +data, err := order.ToBytes() +if err != nil { + panic(err) +} +var restored Order +if err := restored.FromBytes(data); err != nil { + panic(err) } ``` -**Rust:** Use trait extensions: +## Cross-Language Notes -```rust -trait UserExt { - fn display_name(&self) -> String; -} +### Type ID Behavior -impl UserExt for User { - fn display_name(&self) -> String { - format!("{} <{}>", self.name, self.email.as_deref().unwrap_or("")) - } -} -``` +- Explicit `[id=...]` is used directly. +- Without explicit IDs, compiler-generated IDs are used by default. +- With `option enable_auto_type_id = false;`, generated code registers by namespace + type name. -**C++:** Use inheritance or free functions: +### Nested Type Shapes -```cpp -std::string display_name(const demo::User& user) { - return user.name + " <" + user.email.value_or("") + ">"; -} -``` +| Language | Nested Type Form | +| -------- | ----------------------- | +| Java | `Outer.Inner` | +| Python | `Outer.Inner` | +| Rust | `outer::Inner` | +| C++ | `Outer::Inner` | +| Go | `Outer_Inner` (default) | diff --git a/docs/compiler/index.md b/docs/compiler/index.md index fa922056fc..0038018790 100644 --- a/docs/compiler/index.md +++ b/docs/compiler/index.md @@ -19,11 +19,13 @@ license: | limitations under the License. --- -Fory Definition Language (FDL) is a schema definition language for Apache Fory that enables type-safe cross-language serialization. Define your data structures once and generate native data structure code for Java, Python, Go, Rust, and C++. 
+Fory IDL is a schema definition language for Apache Fory that enables type-safe +cross-language serialization. Define your data structures once and generate +native data structure code for Java, Python, Go, Rust, and C++. -## Overview +## Example Schema -FDL provides a simple, intuitive syntax for defining cross-language data structures: +Fory IDL provides a simple, intuitive syntax for defining cross-language data structures: ```protobuf package example; @@ -41,6 +43,11 @@ message User { list<string> tags = 4; } +message Item { + string sku = 1; + int32 quantity = 2; +} + message Order { ref User customer = 1; list<Item> items = 2; @@ -64,11 +71,11 @@ union Animal [id=106] { } ``` -## Why FDL? +## Why Fory IDL? ### Schema-First Development -Define your data model once in FDL and generate consistent, type-safe code across all languages. This ensures: +Define your data model once in Fory IDL and generate consistent, type-safe code across all languages. This ensures: - **Type Safety**: Catch type errors at compile time, not runtime - **Consistency**: All languages use the same field names, types, and structures @@ -77,14 +84,14 @@ Define your data model once in FDL and generate consistent, type-safe code acros ### Fory-Native Features -Unlike generic IDLs, FDL is designed specifically for Fory serialization: +Unlike generic IDLs, Fory IDL is designed specifically for Fory serialization: - **Reference Tracking**: First-class support for shared and circular references via `ref` - **Nullable Fields**: Explicit `optional` modifier for nullable types - **Type Registration**: Built-in support for both numeric IDs and namespace-based registration - **Native Code Generation**: Generates idiomatic code with Fory annotations/macros -### Zero Runtime Overhead +### Low Integration Overhead Generated code uses native language constructs: @@ -102,6 +109,13 @@ Generated code uses native language constructs: pip install fory-compiler ``` +Or install from source: + +```bash +cd 
compiler +pip install -e . +``` + ### 2. Write Your Schema Create `example.fdl`: @@ -151,7 +165,7 @@ data = bytes(person) # or `person.to_bytes()` | Document | Description | | ----------------------------------------------- | ------------------------------------------------- | -| [Fory Schema IDL](schema-idl.md) | Complete language syntax and grammar | +| [Fory IDL Syntax](schema-idl.md) | Complete language syntax and grammar | | [Type System](schema-idl.md#type-system) | Primitive types, collections, and type rules | | [Compiler Guide](compiler-guide.md) | CLI options and build integration | | [Generated Code](generated-code.md) | Output format for each target language | @@ -176,13 +190,13 @@ message Example { ### Cross-Language Compatibility -FDL types map to native types in each language: +Fory IDL types map to native types in each language: -| FDL Type | Java | Python | Go | Rust | C++ | -| -------- | --------- | ------ | -------- | -------- | ------------- | -| `int32` | `int` | `int` | `int32` | `i32` | `int32_t` | -| `string` | `String` | `str` | `string` | `String` | `std::string` | -| `bool` | `boolean` | `bool` | `bool` | `bool` | `bool` | +| Fory IDL Type | Java | Python | Go | Rust | C++ | +| ------------- | --------- | -------------- | -------- | -------- | ------------- | +| `int32` | `int` | `pyfory.int32` | `int32` | `i32` | `int32_t` | +| `string` | `String` | `str` | `string` | `String` | `std::string` | +| `bool` | `boolean` | `bool` | `bool` | `bool` | `bool` | See [Type System](schema-idl.md#type-system) for complete mappings. diff --git a/docs/compiler/protobuf-idl.md b/docs/compiler/protobuf-idl.md index b72b00095c..8292a00f5e 100644 --- a/docs/compiler/protobuf-idl.md +++ b/docs/compiler/protobuf-idl.md @@ -1,5 +1,5 @@ --- -title: Protocol Buffers IDL +title: Protobuf IDL Support sidebar_position: 10 id: protobuf_idl_support license: | @@ -19,72 +19,77 @@ license: | limitations under the License. 
--- -This document compares Google's Protocol Buffers (protobuf) with Fory Definition Language (FDL), helping you understand when to use each and how to migrate between them. +This page explains how Apache Fory works with Protocol Buffers (`.proto`) schemas, +how protobuf concepts map to Fory, and how to use protobuf-only Fory extension options. -## Overview +## What This Page Covers -| Aspect | Protocol Buffers | FDL | -| ---------------------- | --------------------------------- | ----------------------------------- | -| **Primary Purpose** | RPC and message interchange | Cross-language object serialization | -| **Design Philosophy** | Schema evolution, backward compat | Performance, native integration | -| **Reference Tracking** | Not supported | First-class support (`ref`) | -| **Generated Code** | Custom message types | Native language constructs | -| **Serialization** | Tag-length-value encoding | Fory binary protocol | -| **Performance** | Good | Excellent (up to 170x faster) | +- Choosing protobuf vs Fory for your use case +- Syntax and semantic differences that matter during migration +- Supported Fory extension options in protobuf files +- Practical migration patterns from protobuf to Fory -## Syntax Comparison +## Quick Decision Guide -### Package Declaration +| Situation | Recommended Format | +| ------------------------------------------------------------- | ------------------ | +| You are building gRPC APIs and rely on protobuf tooling | Protocol Buffers | +| You need maximum object-graph performance and ref tracking | Fory | +| You need circular/shared references in serialized data | Fory | +| You need strong unknown-field behavior for wire compatibility | Protocol Buffers | +| You need native structs/classes instead of protobuf wrappers | Fory | -**Protocol Buffers:** +## Protobuf vs Fory at a Glance -```protobuf -syntax = "proto3"; -package example.models; -option java_package = "com.example.models"; -option go_package = "example.com/models"; -``` 
+| Aspect | Protocol Buffers | Fory | +| ------------------ | ----------------------------- | ------------------------------------- | +| Primary purpose | RPC/message contracts | High-performance object serialization | +| Encoding model | Tag-length-value | Fory binary protocol | +| Reference tracking | Not built-in | First-class (`ref`) | +| Circular refs | Not supported | Supported | +| Unknown fields | Preserved | Not preserved | +| Generated types | Protobuf-specific model types | Native language constructs | +| gRPC ecosystem | Native | In progress (active development) | -**FDL:** +Fory gRPC support is under active development. For production gRPC +workflows today, protobuf remains the mature/default choice. -```protobuf -package example.models; -``` +## Why Use Apache Fory -FDL uses a single package declaration that maps to all languages automatically. +- Idiomatic generated code: Fory IDL generates language-idiomatic classes and + structs that can be used directly as domain objects. +- Faster serialization: In Fory benchmarks, Fory can be around 10x faster than + protobuf for object serialization workloads. +- Better graph modeling: Shared and circular references are first-class features + instead of application-level ID-link workarounds. -### Enum Definition +See benchmark details under [Performance References](#performance-references). -**Protocol Buffers:** +## Syntax and Semantic Mapping + +### Package and File Options + +**Protocol Buffers** ```protobuf -enum Status { - STATUS_UNSPECIFIED = 0; - STATUS_PENDING = 1; - STATUS_ACTIVE = 2; - STATUS_COMPLETED = 3; -} +syntax = "proto3"; +package example.models; +option java_package = "com.example.models"; +option go_package = "example.com/models"; ``` -**FDL:** +**Fory** ```protobuf -enum Status [id=100] { - PENDING = 0; - ACTIVE = 1; - COMPLETED = 2; -} +package example.models; ``` -Key differences: +Fory uses one package namespace for cross-language registration. 
Language-specific +package placement is still configurable in code generation. -- FDL supports optional type IDs (`[id=100]`) for efficient serialization -- Protobuf requires `_UNSPECIFIED = 0` by convention; FDL uses explicit values -- FDL enum values don't require prefixes +### Message and Enum Definitions -### Message Definition - -**Protocol Buffers:** +**Protocol Buffers** ```protobuf message User { @@ -95,9 +100,14 @@ message User { repeated string tags = 5; map<string, string> metadata = 6; } + +enum Status { + STATUS_UNSPECIFIED = 0; + STATUS_ACTIVE = 1; +} ``` -**FDL:** +**Fory** ```protobuf message User [id=101] { @@ -108,409 +118,218 @@ message User [id=101] { list<string> tags = 5; map<string, string> metadata = 6; } + +enum Status [id=102] { + UNKNOWN = 0; + ACTIVE = 1; +} ``` -Syntax is nearly identical, but FDL adds: +Key differences: + +- Fory can assign stable type IDs directly (`[id=...]`). +- Fory uses `list<T>` (with `repeated T` as alias). +- Enum naming conventions are language-driven instead of protobuf prefix style. -- Type IDs (`[id=101]`) for cross-language registration -- `ref` modifier for reference tracking +### `oneof` to `union` -### Nested Types +Protobuf `oneof` is translated to a nested Fory `union` plus an optional field +referencing that union. -**Protocol Buffers:** +**Protocol Buffers** ```protobuf -message Order { - message Item { - string product_id = 1; - int32 quantity = 2; +message Event { + oneof payload { + string text = 1; + int32 number = 2; } - repeated Item items = 1; } ``` -**FDL:** +**Fory-style shape after translation** ```protobuf -message OrderItem [id=200] { - string product_id = 1; - int32 quantity = 2; -} - -message Order [id=201] { - list<OrderItem> items = 1; +message Event { + union payload { + string text = 1; + int32 number = 2; + } + optional payload payload = 1; } ``` -FDL supports nested types, but generators may flatten them for languages where nested types are not idiomatic. 
- -### Imports - -**Protocol Buffers:** - -```protobuf -import "other.proto"; -import "google/protobuf/timestamp.proto"; -``` - -**FDL:** - -FDL currently requires all types in a single file or uses forward references within the same file. +Notes: -## Feature Comparison +- Union case IDs are derived from the original `oneof` field numbers. +- The synthetic union field uses the smallest `oneof` case number. -### Reference Tracking +### Imports and Well-Known Types -FDL's killer feature is first-class reference tracking: +Protobuf imports are supported. Common well-known types map directly: -**FDL:** +- `google.protobuf.Timestamp` -> `timestamp` +- `google.protobuf.Duration` -> `duration` +- `google.protobuf.Any` -> `any` -```protobuf -message TreeNode [id=300] { - string value = 1; - ref TreeNode parent = 2; - list<ref TreeNode> children = 3; // Element refs - ref list<TreeNode> path = 4; // Collection ref -} +## Type Mapping Highlights -message Graph [id=301] { - list<ref Node> nodes = 1; // Shared references preserved (elements) -} -``` +| Protobuf Type | Fory Mapping | +| ---------------------------------------- | ---------------------------------------- | +| `bool` | `bool` | +| `int32`, `uint32` | variable-length 32-bit integer kinds | +| `sint32` | zigzag 32-bit integer | +| `int64`, `uint64` | variable-length 64-bit integer kinds | +| `sint64` | zigzag 64-bit integer | +| `fixed32`, `fixed64` | fixed-width unsigned integer kinds | +| `sfixed32`, `sfixed64` | fixed-width signed integer kinds | +| `float`, `double` | `float32`, `float64` | +| `string`, `bytes` | `string`, `bytes` | +| `repeated T` | `list<T>` | +| `map<K, V>` | `map<K, V>` | +| `optional T` | `optional T` | +| `oneof` | `union` + optional union reference field | +| `int64 [(fory).type = "tagged_int64"]` | `tagged_int64` encoding | +| `uint64 [(fory).type = "tagged_uint64"]` | `tagged_uint64` encoding | -**Protocol Buffers:** +## Fory Extension Options (Protobuf) -Protobuf cannot represent 
circular or shared references. You must use workarounds: +Fory-specific options in `.proto` use the `(fory).` prefix. ```protobuf -// Workaround: Use IDs instead of references -message TreeNode { - string id = 1; - string value = 2; - string parent_id = 3; // Manual ID reference - repeated string child_ids = 4; -} -``` - -When using protobuf IDL with the Fory compiler, you can opt into reference -tracking via Fory extension options. `weak_ref` implies `ref`, while -`thread_safe_pointer` does not: +option (fory).enable_auto_type_id = true; -```protobuf message TreeNode { TreeNode parent = 1 [(fory).weak_ref = true]; - TreeNode child = 2 [(fory).ref = true, (fory).thread_safe_pointer = false]; + repeated TreeNode children = 2 [(fory).ref = true]; } ``` -### Type System - -| Type | Protocol Buffers | FDL | -| ---------- | ------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------- | -| Boolean | `bool` | `bool` | -| Integers | `int32`, `int64`, `sint32`, `sint64`, `uint32`, `uint64`, `fixed32`, `fixed64`, `sfixed32`, `sfixed64` | `int8`, `int16`, `int32`, `int64`, `uint8`, `uint16`, `uint32`, `uint64`, `fixed_*`, `tagged_*` | -| Floats | `float`, `double` | `float32`, `float64` | -| String | `string` | `string` | -| Binary | `bytes` | `bytes` | -| Timestamp | `google.protobuf.Timestamp` | `timestamp` | -| Date | Not built-in | `date` | -| Duration | `google.protobuf.Duration` | Not built-in | -| List | `repeated T` | `list<T>` (alias: `repeated T`) | -| Map | `map<K, V>` | `map<K, V>` | -| Nullable | `optional T` (proto3) | `optional T` | -| Oneof | `oneof` | `union` (case id = field number) | -| Any | `google.protobuf.Any` | `any` | -| Extensions | `extend` | Not supported | - -### Wire Format - -**Protocol Buffers:** - -- Tag-length-value encoding -- Variable-length integers (varints) -- Field numbers encoded in wire format -- 
Unknown fields preserved - -**FDL/Fory:** - -- Optimized binary format -- Schema-aware encoding -- Type IDs for fast lookup -- Reference tracking support -- Zero-copy deserialization where possible - -### Generated Code Style - -**Protocol Buffers** generates custom types with builders and accessors: - -```java -// Protobuf generated Java -User user = User.newBuilder() - .setId("u123") - .setName("Alice") - .setAge(30) - .build(); -``` - -**FDL** generates native POJOs: - -```java -// FDL generated Java -User user = new User(); -user.setId("u123"); -user.setName("Alice"); -user.setAge(30); -``` - -### Comparison Table - -| Feature | Protocol Buffers | FDL | -| -------------------------- | ----------------- | --------- | -| Schema evolution | Excellent | Good | -| Backward compatibility | Excellent | Good | -| Reference tracking | No | Yes | -| Circular references | No | Yes | -| Native code generation | No (custom types) | Yes | -| Unknown field preservation | Yes | No | -| Schema-less mode | No | Yes\* | -| RPC integration (gRPC) | Yes | No | -| Zero-copy deserialization | Limited | Yes | -| Human-readable format | JSON, TextFormat | No | -| Performance | Good | Excellent | - -\*Fory supports schema-less serialization without FDL - -## When to Use Each - -### Use Protocol Buffers When: - -1. **Building gRPC services**: Protobuf is the native format for gRPC -2. **Maximum backward compatibility**: Protobuf's unknown field handling is robust -3. **Schema evolution is critical**: Adding/removing fields across versions -4. **You need Any types**: Protobuf-specific dynamic payloads -5. **Human-readable debugging**: TextFormat and JSON transcoding available -6. **Ecosystem integration**: Wide tooling support (linting, documentation) - -### Use FDL/Fory When: - -1. **Performance is critical**: Up to 170x faster than protobuf -2. **Cross-language object graphs**: Serialize Java objects, deserialize in Python -3. 
**Circular/shared references**: Object graphs with cycles -4. **Native code preferred**: Standard POJOs, dataclasses, structs -5. **Memory efficiency**: Zero-copy deserialization -6. **Existing object models**: Minimal changes to existing code - -## Performance Comparison +### File-Level Options -Benchmarks show Fory significantly outperforms Protocol Buffers, see more details from below links: +| Option | Type | Description | +| ------------------------------------ | ------ | -------------------------------------------------------------- | +| `(fory).use_record_for_java_message` | bool | Generate Java records for all messages in this file | +| `(fory).polymorphism` | bool | Enable polymorphic serialization metadata by default | +| `(fory).enable_auto_type_id` | bool | Auto-generate type IDs when omitted (compiler default is true) | +| `(fory).evolving` | bool | Default schema-evolution behavior for messages | +| `(fory).go_nested_type_style` | string | Go nested naming style: `underscore` (default) or `camelcase` | -- Benchmark result: https://fory.apache.org/docs/introduction/benchmark -- Benchmark code: https://github.com/apache/fory/tree/main/benchmarks -- Benchmark docs: https://github.com/apache/fory/tree/main/benchmarks +### Message and Enum Options -## Migration Guide +| Option | Applies To | Type | Description | +| ---------------------------- | ------------- | ------ | ---------------------------------------- | +| `(fory).id` | message, enum | int | Explicit type ID for registration | +| `(fory).alias` | message, enum | string | Alternate name used for auto-ID hashing | +| `(fory).evolving` | message | bool | Override file-level evolution setting | +| `(fory).use_record_for_java` | message | bool | Generate Java record for this message | +| `(fory).deprecated` | message, enum | bool | Mark type as deprecated | +| `(fory).namespace` | message | string | Override default package-based namespace | -### From Protocol Buffers to FDL +### Field-Level Options 
-#### Step 1: Convert Syntax +| Option | Type | Description | +| ---------------------------- | ------ | ------------------------------------------------------------ | +| `(fory).ref` | bool | Enable reference tracking for this field | +| `(fory).nullable` | bool | Treat field as nullable (`optional`) | +| `(fory).weak_ref` | bool | Generate weak pointer semantics (C++/Rust codegen) | +| `(fory).thread_safe_pointer` | bool | Rust pointer flavor for ref fields (`Arc` vs `Rc`) | +| `(fory).deprecated` | bool | Mark field as deprecated | +| `(fory).type` | string | Primitive override, currently `tagged_int64`/`tagged_uint64` | -**Before (proto):** +Reference option behavior: -```protobuf -syntax = "proto3"; -package myapp; +- `weak_ref = true` implies ref tracking. +- For `repeated` fields, `(fory).ref = true` applies to list elements. +- For `map<K, V>` fields, `(fory).ref = true` applies to map values. +- `weak_ref` and `thread_safe_pointer` are codegen hints for C++/Rust. -message Person { - string name = 1; - int32 age = 2; - repeated string emails = 3; - Address address = 4; -} - -message Address { - string street = 1; - string city = 2; -} -``` - -**After (FDL):** +### Option Examples by Shape ```protobuf -package myapp; - -message Address [id=100] { - string street = 1; - string city = 2; -} - -message Person [id=101] { - string name = 1; - int32 age = 2; - list<string> emails = 3; - Address address = 4; +message Graph { + Node root = 1 [(fory).ref = true, (fory).thread_safe_pointer = false]; + repeated Node nodes = 2 [(fory).ref = true]; + map<string, Node> cache = 3 [(fory).ref = true]; + Node parent = 4 [(fory).weak_ref = true]; } ``` -#### Step 2: Handle Special Cases +## Reference Tracking vs Protobuf IDs -**oneof fields:** +Protobuf itself does not preserve shared/cyclic object graphs. With Fory +protobuf extensions, you can opt into graph semantics. 
-```protobuf -// Proto -message Event { - oneof payload { - string text = 1; - int32 number = 2; - } -} -``` +**Without Fory ref options (protobuf-style IDs):** ```protobuf -// FDL - oneof becomes a nested union plus an optional field referencing it -message Event [id=102] { - union payload { - string text = 1; - int32 number = 2; - } - optional payload payload = 1; +message TreeNode { + string id = 1; + string parent_id = 2; + repeated string child_ids = 3; } -// Case ids are preserved from the oneof field numbers -// The field number is the smallest case id in the oneof ``` -**Well-known types:** +**With Fory ref options (object graph):** ```protobuf -// Proto -import "google/protobuf/timestamp.proto"; -message Event { - google.protobuf.Timestamp created_at = 1; -} -``` - -```protobuf -// FDL -message Event [id=103] { - timestamp created_at = 1; +message TreeNode { + TreeNode parent = 1 [(fory).weak_ref = true]; + repeated TreeNode children = 2 [(fory).ref = true]; } ``` -#### Step 3: Add Type IDs - -Assign unique type IDs for cross-language compatibility: - -```protobuf -// Reserve ranges for different domains -// 100-199: Common types -// 200-299: User domain -// 300-399: Order domain - -message Address [id=100] { ... } -message Person [id=200] { ... } -message Order [id=300] { ... } -``` - -#### Step 4: Update Build Configuration +## Migration Guide: Protobuf to Fory -**Before (Maven with protobuf):** +### Step 1: Translate Schema Syntax -```xml -<plugin> - <groupId>org.xolstice.maven.plugins</groupId> - <artifactId>protobuf-maven-plugin</artifactId> - <!-- ... --> -</plugin> -``` +- Keep package names stable. +- Replace `repeated T` with `list<T>` (or keep `repeated` alias). +- Add explicit `[id=...]` where you need stable numeric registration. 
-**After (Maven with FDL):** - -```xml -<plugin> - <groupId>org.codehaus.mojo</groupId> - <artifactId>exec-maven-plugin</artifactId> - <executions> - <execution> - <id>generate-fory-types</id> - <phase>generate-sources</phase> - <goals><goal>exec</goal></goals> - <configuration> - <executable>fory</executable> - <arguments> - <argument>compile</argument> - <argument>${project.basedir}/src/main/fdl/schema.fdl</argument> - <argument>--lang</argument> - <argument>java</argument> - <argument>--output</argument> - <argument>${project.build.directory}/generated-sources/fdl</argument> - </arguments> - </configuration> - </execution> - </executions> -</plugin> -``` +### Step 2: Convert `oneof` and Special Types -#### Step 5: Update Application Code +- `oneof` -> `union` + optional union field. +- Map protobuf well-known types to Fory primitives (`timestamp`, `duration`, `any`). -**Before (Protobuf Java):** +### Step 3: Replace Protobuf Workarounds with `ref` -```java -// Protobuf style -Person.Builder builder = Person.newBuilder(); -builder.setName("Alice"); -builder.setAge(30); -Person person = builder.build(); - -byte[] data = person.toByteArray(); -Person restored = Person.parseFrom(data); -``` +Where protobuf used manual ID links for object graphs, switch to Fory `ref` +modifiers (and optional `ref(weak=true)` where needed). -**After (Fory Java):** +### Step 4: Update Build/Codegen -```java -// Fory style -Person person = new Person(); -person.setName("Alice"); -person.setAge(30); +Replace protobuf generation steps with the Fory compiler invocation for target +languages. -Fory fory = Fory.builder().withLanguage(Language.XLANG).build(); -MyappForyRegistration.register(fory); +### Step 5: Run Compatibility Checks -byte[] data = fory.serialize(person); -Person restored = (Person) fory.deserialize(data); -``` +For staged migrations, keep both formats in parallel and verify payload-level +parity with integration tests. 
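The parity check described in Step 5 could be sketched as a small test harness. This is only an illustration under stated assumptions: `json` and `pickle` from the Python standard library stand in for the two serializers (they are not the protobuf or Fory APIs), and `Codec`, `check_parity`, and `find_mismatches` are hypothetical names. The sketch compares decoded values rather than encoded bytes, since two different wire formats are not expected to produce identical bytes.

```python
import json
import pickle
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Codec:
    """A named pair of encode/decode callables (hypothetical helper)."""
    name: str
    encode: Callable[[Any], bytes]
    decode: Callable[[bytes], Any]


# Stand-ins only: swap in the real protobuf and Fory calls in a real test.
LEGACY = Codec(
    "legacy",
    lambda v: json.dumps(v).encode("utf-8"),
    lambda b: json.loads(b.decode("utf-8")),
)
CANDIDATE = Codec("candidate", pickle.dumps, pickle.loads)


def check_parity(value: Any, old: Codec, new: Codec) -> bool:
    """Round-trip `value` through both codecs and compare decoded results."""
    return old.decode(old.encode(value)) == new.decode(new.encode(value))


def find_mismatches(samples: list, old: Codec, new: Codec) -> list:
    """Return the samples whose round-trips disagree between the codecs."""
    return [s for s in samples if not check_parity(s, old, new)]


if __name__ == "__main__":
    samples = [{"name": "Alice", "age": 30}, {"emails": ["a@example.com"]}]
    print(find_mismatches(samples, LEGACY, CANDIDATE))
```

Running such a harness over recorded production-like samples during a staged migration would surface values that the two formats decode differently, for example a tuple that one stand-in returns as a list.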
-### Coexistence Strategy +## Coexistence Strategy -For gradual migration, you can run both systems in parallel: +You can run protobuf and Fory in parallel during migration: ```java -// Dual serialization during migration public byte[] serialize(Object obj, Format format) { if (format == Format.PROTOBUF) { return ((MessageLite) obj).toByteArray(); - } else { - return fory.serialize(obj); } -} - -// Convert between formats -public ForyPerson fromProto(ProtoPerson proto) { - ForyPerson person = new ForyPerson(); - person.setName(proto.getName()); - person.setAge(proto.getAge()); - return person; + return fory.serialize(obj); } ``` -## Summary +Use translators at service boundaries while internal object-graph heavy paths +migrate first. -| Aspect | Choose Protocol Buffers | Choose FDL/Fory | -| ---------------- | ----------------------- | ---------------------- | -| Use case | RPC, API contracts | Object serialization | -| Performance | Acceptable | Critical | -| References | Not needed | Circular/shared needed | -| Code style | Builder pattern OK | Native POJOs preferred | -| Schema evolution | Complex requirements | Simpler requirements | -| Ecosystem | Need gRPC, tooling | Need raw performance | +## Performance References + +- Benchmarks: https://fory.apache.org/docs/introduction/benchmark +- Benchmark code: https://github.com/apache/fory/tree/main/benchmarks + +## Summary -Both tools excel in their domains. Protocol Buffers shines for RPC and API contracts with strong schema evolution guarantees. FDL/Fory excels at high-performance object serialization with native language integration and reference tracking support. +Use protobuf when your primary concern is API contracts and gRPC ecosystem +integration. Use Fory when object-graph performance, native models, and +reference semantics are the primary concern. 
diff --git a/docs/compiler/schema-idl.md b/docs/compiler/schema-idl.md index 69944c89a6..869187cd0e 100644 --- a/docs/compiler/schema-idl.md +++ b/docs/compiler/schema-idl.md @@ -19,20 +19,29 @@ license: | limitations under the License. --- -This document provides a complete reference for the Fory Definition Language (FDL) syntax. +This document provides the syntax and semantic reference for Fory IDL. + +For compiler usage and build integration, see +[Compiler Guide](compiler-guide.md). For protobuf/FlatBuffers frontend mapping +rules, see [Protocol Buffers IDL Support](protobuf-idl.md) and +[FlatBuffers IDL Support](flatbuffers-idl.md). ## File Structure -An FDL file consists of: +A Fory IDL file typically consists of: 1. Optional package declaration -2. Optional import statements -3. Type definitions (enums, messages, and unions) +2. Optional file-level options +3. Optional import statements +4. Type definitions (enums, messages, and unions) ```protobuf // Optional package declaration package com.example.models; +// Optional file-level options +option java_package = "com.example.models"; + // Import statements import "common/types.fdl"; @@ -45,7 +54,7 @@ union Event [id=103] { ... 
} ## Comments -FDL supports both single-line and block comments: +Fory IDL supports both single-line and block comments: ```protobuf // This is a single-line comment @@ -119,7 +128,7 @@ message Payment { - Generated Java files will be in `com/mycorp/payment/v1/` directory - Java package declaration will be `package com.mycorp.payment.v1;` -- Type registration still uses the FDL package (`payment`) for cross-language compatibility +- Type registration still uses the Fory IDL package (`payment`) for cross-language compatibility ### Go Package Option @@ -140,7 +149,7 @@ message Payment { - Generated Go files will have `package paymentv1` - The import path can be used in other Go code -- Type registration still uses the FDL package (`payment`) for cross-language compatibility +- Type registration still uses the Fory IDL package (`payment`) for cross-language compatibility ### Java Outer Classname Option @@ -263,17 +272,31 @@ message Payment { } ``` -### Fory Extension Options +### Protobuf Compatibility Options -FDL supports protobuf-style extension options for Fory-specific configuration: +Fory IDL accepts protobuf-style extension syntax (for example, `(fory).id`) for +compatibility, but native Fory IDL style uses plain option keys such as `id`, +`evolving`, `ref`, and `nullable` without the `(fory)` prefix. + +Equivalent forms: ```protobuf -option (fory).use_record_for_java_message = true; -option (fory).polymorphism = true; -option (fory).enable_auto_type_id = true; +// Native Fory IDL style (preferred in .fdl files) +message Node [id=100] { + ref Node parent = 1; + optional string nickname = 2; +} + +// Protobuf-style compatibility syntax +message Node { + option (fory).id = 100; + Node parent = 1 [(fory).ref = true]; + string nickname = 2 [(fory).nullable = true]; +} ``` -See the [Fory Extension Options](#fory-extension-options) section for the complete list of file, message, enum, union, and field options. 
+For the protobuf-specific extension option guide, see +[Protocol Buffers IDL Support](protobuf-idl.md#fory-extension-options-protobuf). ### Option Priority @@ -281,7 +304,7 @@ For language-specific packages: 1. Command-line package override (highest priority) 2. Language-specific option (`java_package`, `go_package`) -3. FDL package declaration (fallback) +3. Fory IDL package declaration (fallback) **Example:** @@ -318,7 +341,7 @@ All languages will register `User` with namespace `myapp.models`, enabling: ## Import Statement -Import statements allow you to use types defined in other FDL files. +Import statements allow you to use types defined in other Fory IDL files. ### Basic Syntax @@ -400,9 +423,9 @@ import public "other.fdl"; import weak "other.fdl"; ``` -**`import public`**: FDL uses a simpler import model. All imported types are available to the importing file only. Re-exporting is not supported. Import each file directly where needed. +**`import public`**: Fory IDL uses a simpler import model. All imported types are available to the importing file only. Re-exporting is not supported. Import each file directly where needed. -**`import weak`**: FDL requires all imports to be present at compile time. Optional dependencies are not supported. +**`import weak`**: Fory IDL requires all imports to be present at compile time. Optional dependencies are not supported. ### Import Errors @@ -837,7 +860,8 @@ message Node { | C++ | `Node parent` | `std::shared_ptr<Node> parent` | Rust uses `Arc` by default; use `ref(thread_safe=false)` or `ref(weak=true)` -to customize pointer types (see [Field-Level Fory Options](#field-level-fory-options)). +to customize pointer types. For protobuf option syntax, see +[Protocol Buffers IDL Support](protobuf-idl.md#field-level-options). #### `list` @@ -879,14 +903,14 @@ apply to elements. `repeated` is accepted as an alias for `list`. 
**List modifier mapping:** -| FDL | Java | Python | Go | Rust | C++ | +| Fory IDL | Java | Python | Go | Rust | C++ | | ----------------------- | ---------------------------------------------- | --------------------------------------- | ----------------------- | --------------------- | ----------------------------------------- | | `optional list<string>` | `List<String>` + `@ForyField(nullable = true)` | `Optional[List[str]]` | `[]string` + `nullable` | `Option<Vec<String>>` | `std::optional<std::vector<std::string>>` | | `list<optional string>` | `List<String>` (nullable elements) | `List[Optional[str]]` | `[]*string` | `Vec<Option<String>>` | `std::vector<std::optional<std::string>>` | | `ref list<User>` | `List<User>` + `@ForyField(ref = true)` | `List[User]` + `pyfory.field(ref=True)` | `[]User` + `ref` | `Arc<Vec<User>>` | `std::shared_ptr<std::vector<User>>` | | `list<ref User>` | `List<User>` | `List[User]` | `[]*User` + `ref=false` | `Vec<Arc<User>>` | `std::vector<std::shared_ptr<User>>` | -Use `ref(thread_safe=false)` in FDL (or `[(fory).thread_safe_pointer = false]` in protobuf) +Use `ref(thread_safe=false)` in Fory IDL (or `[(fory).thread_safe_pointer = false]` in protobuf) to generate `Rc` instead of `Arc` in Rust. ## Field Numbers @@ -901,24 +925,20 @@ message Example { } ``` -**Rules:** - -- Must be unique within a message -- Must be positive integers -- Used for field ordering and identification -- Gaps in numbering are allowed (useful for deprecating fields) +**Rules and best practices:** -**Best Practices:** - -- Use sequential numbers starting from 1 -- Reserve number ranges for different categories -- Never reuse numbers for different fields (even after deletion) +- Numbers must be unique within a message. +- Numbers must be positive integers. +- Gaps are allowed and are useful when fields are removed. +- Prefer sequential numbering from `1`. +- Never reuse a removed field number for a different field. 
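The field-number rules above lend themselves to a mechanical check. The following is a minimal sketch, not part of the Fory compiler: `validate_field_numbers` is a hypothetical helper that takes `(field_name, field_number)` pairs for one message and reports violations of the uniqueness and positivity rules. Gaps are accepted, and the "never reuse a removed number" rule is not covered because it needs schema history.

```python
from typing import Iterable, List, Tuple


def validate_field_numbers(fields: Iterable[Tuple[str, int]]) -> List[str]:
    """Return human-readable problems for a message's (name, number) pairs."""
    problems: List[str] = []
    seen = {}  # field number -> first field name that used it
    for name, number in fields:
        if number <= 0:
            problems.append(f"{name}: field number {number} is not positive")
        elif number in seen:
            problems.append(
                f"{name}: field number {number} already used by {seen[number]}"
            )
        else:
            seen[number] = name
    return problems


if __name__ == "__main__":
    # Gaps (1, 2, 5) are fine; the duplicate 2 and the 0 are reported.
    fields = [("id", 1), ("name", 2), ("email", 5), ("nick", 2), ("bad", 0)]
    for problem in validate_field_numbers(fields):
        print(problem)
```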
## Type System -FDL provides a cross-language type system for primitives, named types, and collections. -Field modifiers like `optional`, `list`, and `ref` define nullability, collections, and -reference tracking (see [Field Modifiers](#field-modifiers)). +Fory IDL provides a cross-language type system for primitives, named types, and +collections. Field modifiers (`optional`, `list`, `ref`) control nullability, +collection behavior, and reference tracking (see +[Field Modifiers](#field-modifiers)). ### Primitive Types @@ -951,10 +971,6 @@ reference tracking (see [Field Modifiers](#field-modifiers)). #### Boolean -```protobuf -bool is_active = 1; -``` - | Language | Type | Notes | | -------- | --------------------- | ------------------ | | Java | `boolean` / `Boolean` | Primitive or boxed | @@ -965,27 +981,27 @@ bool is_active = 1; #### Integer Types -FDL provides fixed-width signed integers (varint encoding for 32/64-bit by default): +Fory IDL provides fixed-width signed integers (varint encoding for 32/64-bit by default): -| FDL Type | Size | Range | -| -------- | ------ | ----------------- | -| `int8` | 8-bit | -128 to 127 | -| `int16` | 16-bit | -32,768 to 32,767 | -| `int32` | 32-bit | -2^31 to 2^31 - 1 | -| `int64` | 64-bit | -2^63 to 2^63 - 1 | +| Fory IDL Type | Size | Range | +| ------------- | ------ | ----------------- | +| `int8` | 8-bit | -128 to 127 | +| `int16` | 16-bit | -32,768 to 32,767 | +| `int32` | 32-bit | -2^31 to 2^31 - 1 | +| `int64` | 64-bit | -2^63 to 2^63 - 1 | **Language Mapping (Signed):** -| FDL | Java | Python | Go | Rust | C++ | -| ------- | ------- | -------------- | ------- | ----- | --------- | -| `int8` | `byte` | `pyfory.int8` | `int8` | `i8` | `int8_t` | -| `int16` | `short` | `pyfory.int16` | `int16` | `i16` | `int16_t` | -| `int32` | `int` | `pyfory.int32` | `int32` | `i32` | `int32_t` | -| `int64` | `long` | `pyfory.int64` | `int64` | `i64` | `int64_t` | +| Fory IDL | Java | Python | Go | Rust | C++ | +| -------- | ------- 
| -------------- | ------- | ----- | --------- | +| `int8` | `byte` | `pyfory.int8` | `int8` | `i8` | `int8_t` | +| `int16` | `short` | `pyfory.int16` | `int16` | `i16` | `int16_t` | +| `int32` | `int` | `pyfory.int32` | `int32` | `i32` | `int32_t` | +| `int64` | `long` | `pyfory.int64` | `int64` | `i64` | `int64_t` | -FDL provides fixed-width unsigned integers (varint encoding for 32/64-bit by default): +Fory IDL provides fixed-width unsigned integers (varint encoding for 32/64-bit by default): -| FDL | Size | Range | +| Fory IDL | Size | Range | | -------- | ------ | ------------- | | `uint8` | 8-bit | 0 to 255 | | `uint16` | 16-bit | 0 to 65,535 | @@ -994,44 +1010,19 @@ FDL provides fixed-width unsigned integers (varint encoding for 32/64-bit by def **Language Mapping (Unsigned):** -| FDL | Java | Python | Go | Rust | C++ | +| Fory IDL | Java | Python | Go | Rust | C++ | | -------- | ------- | --------------- | -------- | ----- | ---------- | | `uint8` | `short` | `pyfory.uint8` | `uint8` | `u8` | `uint8_t` | | `uint16` | `int` | `pyfory.uint16` | `uint16` | `u16` | `uint16_t` | | `uint32` | `long` | `pyfory.uint32` | `uint32` | `u32` | `uint32_t` | | `uint64` | `long` | `pyfory.uint64` | `uint64` | `u64` | `uint64_t` | -**Examples:** - -```protobuf -message Counters { - int8 tiny = 1; - int16 small = 2; - int32 medium = 3; - int64 large = 4; -} -``` - -**Python type hints:** - -```python -from dataclasses import dataclass -from pyfory import int8, int16, int32 - -@dataclass -class Counters: - tiny: int8 - small: int16 - medium: int32 - large: int # int64 maps to native int -``` - #### Integer Encoding Variants -For 32/64-bit integers, FDL uses varint encoding by default. Use explicit types when +For 32/64-bit integers, Fory IDL uses varint encoding by default. 
Use explicit types when you need fixed-width or tagged encoding: -| FDL Type | Encoding | Notes | +| Fory IDL Type | Encoding | Notes | | --------------- | -------- | ------------------------ | | `fixed_int32` | fixed | Signed 32-bit | | `fixed_int64` | fixed | Signed 64-bit | @@ -1042,26 +1033,20 @@ you need fixed-width or tagged encoding: #### Floating-Point Types -| FDL Type | Size | Precision | -| --------- | ------ | ------------- | -| `float32` | 32-bit | ~7 digits | -| `float64` | 64-bit | ~15-16 digits | +| Fory IDL Type | Size | Precision | +| ------------- | ------ | ------------- | +| `float32` | 32-bit | ~7 digits | +| `float64` | 64-bit | ~15-16 digits | **Language Mapping:** -| FDL | Java | Python | Go | Rust | C++ | +| Fory IDL | Java | Python | Go | Rust | C++ | | --------- | -------- | ---------------- | --------- | ----- | -------- | | `float32` | `float` | `pyfory.float32` | `float32` | `f32` | `float` | | `float64` | `double` | `pyfory.float64` | `float64` | `f64` | `double` | #### String Type -UTF-8 encoded text: - -```protobuf -string name = 1; -``` - | Language | Type | Notes | | -------- | ------------- | --------------------- | | Java | `String` | Immutable | @@ -1072,12 +1057,6 @@ string name = 1; #### Bytes Type -Raw binary data: - -```protobuf -bytes data = 1; -``` - | Language | Type | Notes | | -------- | ---------------------- | --------- | | Java | `byte[]` | | @@ -1090,12 +1069,6 @@ bytes data = 1; ##### Date -Calendar date without time: - -```protobuf -date birth_date = 1; -``` - | Language | Type | Notes | | -------- | --------------------------- | ----------------------- | | Java | `java.time.LocalDate` | | @@ -1106,12 +1079,6 @@ date birth_date = 1; ##### Timestamp -Date and time with nanosecond precision: - -```protobuf -timestamp created_at = 1; -``` - | Language | Type | Notes | | -------- | -------------------------------- | ----------------------- | | Java | `java.time.Instant` | UTC-based | @@ -1122,12 +1089,6 @@ timestamp 
created_at = 1; #### Any -Dynamic value with runtime type information: - -```protobuf -any payload = 1; -``` - | Language | Type | Notes | | -------- | -------------- | -------------------- | | Java | `Object` | Runtime type written | @@ -1136,6 +1097,34 @@ any payload = 1; | Rust | `Box<dyn Any>` | Runtime type written | | C++ | `std::any` | Runtime type written | +**Example:** + +```protobuf +enum EventType [id=120] { + CREATED = 0; + DELETED = 1; +} + +message UserCreated [id=121] { + string user_id = 1; +} + +message Envelope [id=122] { + EventType type = 1; + any payload = 2; +} +``` + +**Generated Code (`Envelope.payload`):** + +| Language | Generated Field Type | +| -------- | ----------------------- | +| Java | `Object payload` | +| Python | `payload: Any` | +| Go | `Payload any` | +| Rust | `payload: Box<dyn Any>` | +| C++ | `std::any payload` | + **Notes:** - `any` always writes a null flag (same as `nullable`) because values may be empty. @@ -1185,7 +1174,7 @@ message Config { **Language Mapping:** -| FDL | Java | Python | Go | Rust | C++ | +| Fory IDL | Java | Python | Go | Rust | C++ | | -------------------- | ---------------------- | ----------------- | ------------------ | ----------------------- | -------------------------------- | | `map<string, int32>` | `Map<String, Integer>` | `Dict[str, int]` | `map[string]int32` | `HashMap<String, i32>` | `std::map<std::string, int32_t>` | | `map<string, User>` | `Map<String, User>` | `Dict[str, User]` | `map[string]User` | `HashMap<String, User>` | `std::map<std::string, User>` | @@ -1258,22 +1247,14 @@ message Config { ... } // Auto-generated when enable_auto_type_id = true You can set `[alias="..."]` to change the hash source without renaming the type. -### Pay-as-you-go principle - -- IDs: Messages, unions, and enums use numeric IDs; if omitted and - `enable_auto_type_id = true`, the compiler auto-generates one. 
-- Auto-generation: If no ID is provided, fory generates one using - MurmurHash3(utf8(package.type_name)) (32-bit). If a package alias is specified, - the alias is used instead of the package name; if a type alias is specified, - the alias is used instead of the type name. -- Space Efficiency: - - Manual IDs (0-127): Encoded as 1 byte (Varint). Ideal for high-frequency messages. - - Generated IDs: Usually large integers, taking 4-5 bytes in the wire format (varuint32). -- Conflict Resolution: While the collision probability is extremely low, conflicts are detected - at compile-time. The compiler raises an error and asks you to specify an explicit `id` or use - the `alias` option to change the hash source. +### Practical Notes -Explicit is better than implicit, but automation is better than toil. +- If a type omits `id` and `enable_auto_type_id = true`, Fory generates an ID + with `MurmurHash3(utf8(package.type_name))` (32-bit). +- Package alias and type alias change the hash input and can be used to resolve + hash collisions without renaming public types. +- Manual IDs in the small varint range (`0-127`) are compact on the wire; auto + IDs are typically larger and usually consume 4-5 bytes. ### ID Assignment Strategy @@ -1368,174 +1349,8 @@ message ShopConfig { } ``` -## Fory Extension Options - -FDL supports protobuf-style extension options for Fory-specific configuration. These use the `(fory)` prefix to indicate they are Fory extensions. 
- -### File-Level Fory Options - -```protobuf -option (fory).use_record_for_java_message = true; -option (fory).polymorphism = true; -option (fory).enable_auto_type_id = true; -option (fory).evolving = true; -``` - -| Option | Type | Description | -| ----------------------------- | ------ | ----------------------------------------------------------------------------------------------------------------------------------- | -| `use_record_for_java_message` | bool | Generate Java records instead of classes | -| `polymorphism` | bool | Enable polymorphism for all types | -| `enable_auto_type_id` | bool | Auto-generate numeric type IDs when omitted (default: true) | -| `evolving` | bool | Default schema evolution for messages in this file (default: true). Set false to reduce payload size for messages that never change | -| `go_nested_type_style` | string | Go nested type naming: `underscore` (default) or `camelcase` | - -### Message-Level Fory Options - -Options can be specified inside the message body: - -```protobuf -message MyMessage { - option (fory).id = 100; - option (fory).evolving = false; - option (fory).use_record_for_java = true; - string name = 1; -} -``` - -| Option | Type | Description | -| --------------------- | ------ | ------------------------------------------------------------------------------------------------------------------ | -| `id` | int | Type ID for serialization (auto-generated if omitted and enable_auto_type_id = true) | -| `alias` | string | Alternate name used as hash source for auto-generated IDs | -| `evolving` | bool | Schema evolution support (default: true). When false, schema is fixed like a struct and avoids compatible metadata | -| `use_record_for_java` | bool | Generate Java record for this message | -| `deprecated` | bool | Mark this message as deprecated | -| `namespace` | string | Custom namespace for type registration | - -**Note:** `option (fory).id = 100` is equivalent to the inline syntax `message MyMessage [id=100]`. 
- -### Union-Level Fory Options - -```protobuf -union MyUnion [id=100, alias="MyUnionAlias"] { - string text = 1; -} -``` - -| Option | Type | Description | -| ------------ | ------ | ------------------------------------------------------------------------------------ | -| `id` | int | Type ID for serialization (auto-generated if omitted and enable_auto_type_id = true) | -| `alias` | string | Alternate name used as hash source for auto-generated IDs | -| `deprecated` | bool | Mark this union as deprecated | - -### Enum-Level Fory Options - -```protobuf -enum Status { - option (fory).id = 101; - option (fory).deprecated = true; - UNKNOWN = 0; - ACTIVE = 1; -} -``` - -| Option | Type | Description | -| ------------ | ---- | ---------------------------------------- | -| `id` | int | Type ID for serialization (sets type_id) | -| `deprecated` | bool | Mark this enum as deprecated | - -### Field-Level Fory Options - -Field options are specified in brackets after the field number (FDL uses `ref` modifiers instead -of bracket options for reference settings): - -```protobuf -message Example { - ref MyType friend = 1; - string nickname = 2 [nullable = true]; - ref MyType data = 3 [nullable = true]; - ref(weak=true) MyType parent = 4; -} -``` - -| Option | Type | Description | -| --------------------- | ---- | --------------------------------------------------------- | -| `ref` | bool | Enable reference tracking (protobuf extension option) | -| `nullable` | bool | Mark field as nullable (sets optional flag) | -| `deprecated` | bool | Mark this field as deprecated | -| `thread_safe_pointer` | bool | Rust only: use `Arc` (true) or `Rc` (false) for ref types | -| `weak_ref` | bool | C++/Rust only: generate weak pointers for `ref` fields | - -**Note:** For FDL, use `ref` (and optional `ref(...)`) modifiers: -`ref MyType friend = 1;`, `list<ref(weak=true) Child> children = 2;`, -`map<string, ref(weak=true) Node> nodes = 3;`. 
For protobuf, use -`[(fory).ref = true]` and `[(fory).weak_ref = true]`. `weak_ref` is a codegen -hint for C++/Rust and is ignored by Java/Python/Go. It must be used with `ref` -(`list<ref T>` for collections, or `map<..., ref T>` for map values). - -To use `Rc` instead of `Arc` in Rust for a specific field: - -```fdl -message Graph { - ref(thread_safe=false) Node root = 1; -} -``` - -### Combining Standard and Fory Options - -You can combine standard options with Fory extension options: - -```protobuf -message User { - option deprecated = true; // Standard option - option (fory).evolving = false; // Fory extension option - - string name = 1; - MyType data = 2 [deprecated = true, (fory).ref = true]; -} -``` - -### Fory Options Proto File - -For reference, the Fory options are defined in `extension/fory_options.proto`: - -```protobuf -// File-level options -extend google.protobuf.FileOptions { - optional ForyFileOptions fory = 50001; -} - -message ForyFileOptions { - optional bool use_record_for_java_message = 1; - optional bool polymorphism = 2; - optional bool enable_auto_type_id = 3; - optional bool evolving = 4; -} - -// Message-level options -extend google.protobuf.MessageOptions { - optional ForyMessageOptions fory = 50001; -} - -message ForyMessageOptions { - optional int32 id = 1; - optional bool evolving = 2; - optional bool use_record_for_java = 3; - optional bool deprecated = 4; - optional string namespace = 5; -} - -// Field-level options -extend google.protobuf.FieldOptions { - optional ForyFieldOptions fory = 50001; -} - -message ForyFieldOptions { - optional bool ref = 1; - optional bool nullable = 2; - optional bool deprecated = 3; - optional bool weak_ref = 4; -} -``` +For protobuf-specific extension options and `(fory).` syntax, see +[Protocol Buffers IDL Support](protobuf-idl.md#fory-extension-options-protobuf). 
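As a side note on the type ID sizes mentioned under Practical Notes, the byte counts can be reproduced with a short calculation. This sketch assumes the `varuint32` encoding uses the common layout of 7 payload bits per byte; that layout is an assumption here, and Fory's actual encoder may differ. `varuint_size` is a hypothetical helper, not a Fory API.

```python
def varuint_size(value: int) -> int:
    """Bytes needed for an unsigned varint with 7 payload bits per byte."""
    if value < 0:
        raise ValueError("unsigned varints cannot encode negative values")
    size = 1
    while value >= 0x80:  # more than 7 bits remain, so another byte follows
        value >>= 7
        size += 1
    return size


if __name__ == "__main__":
    for type_id in (0, 100, 127, 128, 16_383, 2_097_152, 0xFFFF_FFFF):
        print(f"id={type_id:>10} -> {varuint_size(type_id)} byte(s)")
```

Under that assumption, IDs up to 127 fit in one byte, while most values spread across the 32-bit range need four or five bytes, which matches the guidance to assign small manual IDs to frequently sent types.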
## Grammar Summary @@ -1574,7 +1389,7 @@ reserved_item := INTEGER | INTEGER 'to' INTEGER | INTEGER 'to' 'max' | STRING modifiers := { 'optional' | 'ref' } ['list' { 'optional' | 'ref' }] -field_type := primitive_type | named_type | map_type +field_type := primitive_type | named_type | list_type | map_type primitive_type := 'bool' | 'int8' | 'int16' | 'int32' | 'int64' | 'uint8' | 'uint16' | 'uint32' | 'uint64' @@ -1586,6 +1401,7 @@ primitive_type := 'bool' | 'any' named_type := qualified_name qualified_name := IDENTIFIER ('.' IDENTIFIER)* // e.g., Parent.Child +list_type := 'list' '<' { 'optional' | 'ref' } field_type '>' map_type := 'map' '<' field_type ',' field_type '>' type_options := '[' type_option (',' type_option)* ']'
