chaokunyang opened a new issue, #3004:
URL: https://github.com/apache/fory/issues/3004
## Feature Request
Extend the `#[derive(ForyObject)]` macro to support `#[fory()]` field
attributes for performance and space optimization during xlang serialization.
## Is your feature request related to a problem? Please describe
Currently, Fory's Rust xlang serialization treats all struct fields
uniformly:
1. **Null checks are always performed** - Even for fields that are never
null, Fory writes a null/ref flag (1 byte per field)
2. **Reference tracking is always applied** (when enabled globally) - Even
for fields that won't be shared/cyclic, objects are tracked with hash lookup
cost
3. **Field names use meta string encoding** - In schema evolution mode,
field names are compressed with meta string encoding, but a long name still
costs several bytes per field
These defaults ensure correctness but introduce unnecessary overhead when
the developer has more specific knowledge about their data model.
## Describe the solution you'd like
Extend the `#[fory()]` attribute to support field-level metadata:
```rust
use fory::ForyObject;
use std::rc::Rc;

// `Bar` and `Node` stand in for other user-defined types.
#[derive(ForyObject)]
struct Foo {
    // Field f1: non-nullable (default), no ref tracking (default)
    // Tag ID 0 provides compact encoding in schema evolution mode
    #[fory(id = 0)]
    f1: String,

    // Field f2: non-nullable (default), no ref tracking (default)
    #[fory(id = 1)]
    f2: Bar,

    // Field f3: nullable field that may contain null values
    #[fory(id = 2, nullable)]
    f3: Option<String>,

    // Field parent: shared reference that needs tracking (e.g., for circular refs)
    #[fory(id = 3, ref, nullable)]
    parent: Option<Rc<Node>>,

    // Field with long name: tag ID provides significant space savings
    #[fory(id = 4)]
    very_long_field_name_that_would_take_many_bytes: String,

    // Explicit opt-out: use field name encoding but get nullable optimization
    #[fory(id = -1, nullable)]
    optional_field: Option<String>,
}
```
### Attribute Syntax
```rust
#[fory(
    id = <i32>,  // REQUIRED: Tag ID for field encoding
                 //   >= 0: Use tag ID encoding
                 //   -1: Use field name encoding (opt-out)
    nullable,    // Optional: Field can be None (default: false)
                 //   Required for Option<T> types
    ref,         // Optional: Track references (default: false)
                 //   Useful for Rc<T>, Arc<T>, circular references
)]
```
### Design Decision: Required `id`
The `id` attribute is **required** when using `#[fory()]` on a field:
- `id = 0` to `id = N`: Use tag ID encoding (compact)
- `id = -1`: Explicit opt-out, use field name encoding
Rationale:
1. **Explicit control**: Using `#[fory()]` means opting into explicit control
2. **Compile-time validation**: Proc macro can check for duplicate IDs
3. **Proven pattern**: Similar to protobuf field numbers
### Optimization Details
#### 1. Non-nullable (Default) Optimization
When `nullable` is NOT specified:
- Skip writing the null flag entirely (1 byte saved per field)
- Directly serialize the field value
- Compile error if field type is `Option<T>` without `nullable`
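
To make the one-byte saving concrete, here is a minimal sketch that writes the same string once as a non-nullable field and once as a nullable field. The flag values and the `write_str`, `write_required`, and `write_nullable` helpers are assumptions made for illustration; they are not Fory's actual API or wire constants.

```rust
// Illustrative flag values only; the real constants are defined by the Fory spec.
const NULL_FLAG: u8 = 0;
const NOT_NULL_FLAG: u8 = 1;

/// Hypothetical helper: one length byte followed by the UTF-8 bytes.
fn write_str(buf: &mut Vec<u8>, s: &str) {
    assert!(s.len() < 256, "sketch only handles short strings");
    buf.push(s.len() as u8);
    buf.extend_from_slice(s.as_bytes());
}

/// Non-nullable field: the value is written directly, with no flag byte.
fn write_required(buf: &mut Vec<u8>, value: &str) {
    write_str(buf, value);
}

/// Nullable field: one flag byte always precedes the optional value.
fn write_nullable(buf: &mut Vec<u8>, value: Option<&str>) {
    match value {
        Some(s) => {
            buf.push(NOT_NULL_FLAG);
            write_str(buf, s);
        }
        None => buf.push(NULL_FLAG),
    }
}

fn main() {
    let mut required = Vec::new();
    write_required(&mut required, "abc");

    let mut nullable = Vec::new();
    write_nullable(&mut nullable, Some("abc"));

    // The nullable encoding is exactly one byte longer.
    println!(
        "required: {} bytes, nullable: {} bytes",
        required.len(),
        nullable.len()
    );
}
```

In this sketch the only difference between the two encodings is the leading flag byte, which is the per-field cost the non-nullable default avoids.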
#### 2. No Ref Tracking (Default) Optimization
When `ref` is NOT specified:
- Skip reference tracking map operations
- Skip ref flag when combined with non-nullable
- For `Rc<T>`/`Arc<T>`, consider adding `ref` if circular refs are possible
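
As a rough illustration of the cost that `ref` opts into, the sketch below tracks `Rc<T>` values by allocation address in a hash map. `RefTracker` and `RefWrite` are hypothetical names introduced here; Fory's real reference resolver may work differently.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical tracker: maps the address of a shared allocation to the id
/// it was given when first written.
struct RefTracker {
    seen: HashMap<usize, u32>,
}

/// What a writer would emit for a tracked field.
#[derive(Debug, PartialEq)]
enum RefWrite {
    /// First occurrence: write the full value and remember it under this id.
    NewValue(u32),
    /// Repeat occurrence: write only a back-reference to the earlier id.
    BackRef(u32),
}

impl RefTracker {
    fn new() -> Self {
        RefTracker { seen: HashMap::new() }
    }

    /// One hash lookup per tracked field, plus an insert on first sight.
    /// A field without `ref` would skip this work entirely.
    fn track<T>(&mut self, value: &Rc<T>) -> RefWrite {
        let addr = Rc::as_ptr(value) as usize;
        if let Some(&id) = self.seen.get(&addr) {
            return RefWrite::BackRef(id);
        }
        let id = self.seen.len() as u32;
        self.seen.insert(addr, id);
        RefWrite::NewValue(id)
    }
}

fn main() {
    let shared = Rc::new(String::from("node"));
    let alias = Rc::clone(&shared); // same allocation as `shared`
    let other = Rc::new(String::from("node")); // equal contents, different allocation

    let mut tracker = RefTracker::new();
    println!("{:?}", tracker.track(&shared)); // NewValue(0)
    println!("{:?}", tracker.track(&alias)); // BackRef(0)
    println!("{:?}", tracker.track(&other)); // NewValue(1)
}
```

Tracking by address rather than by value is what lets shared and cyclic structures round-trip with their identity preserved, and it is also why every tracked field pays for a map operation.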
#### 3. Tag ID Optimization
When `id = N` where N >= 0:
- Field name encoded as varint instead of meta string
- Significant space savings for long field names
**Space savings:**
| Field Name | Meta String (approx) | Tag ID |
|------------|---------------------|--------|
| `f1` | ~2 bytes | 1 byte |
| `user_name` | ~6 bytes | 1 byte |
| `transaction_id` | ~10 bytes | 1 byte |
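
The 1-byte figure can be checked with a small sketch, assuming tag IDs are written as a plain LEB128-style unsigned varint (7 payload bits per byte). The exact layout Fory uses is defined by the protocol spec and may differ; `write_varint` and `varint_len` are illustrative helpers, not Fory functions.

```rust
/// Encodes `value` as an unsigned LEB128-style varint: 7 payload bits per
/// byte, with the high bit set on every byte except the last.
fn write_varint(buf: &mut Vec<u8>, mut value: u32) {
    while value >= 0x80 {
        buf.push(((value & 0x7F) as u8) | 0x80);
        value >>= 7;
    }
    buf.push(value as u8);
}

/// Number of bytes the varint encoding of `value` occupies.
fn varint_len(value: u32) -> usize {
    let mut buf = Vec::new();
    write_varint(&mut buf, value);
    buf.len()
}

fn main() {
    for &id in [0u32, 4, 127, 128, 16_383, 16_384].iter() {
        println!("tag id {:>6} -> {} byte(s)", id, varint_len(id));
    }
}
```

Under this assumption, IDs 0 through 127 fit in one byte and IDs up to 16,383 fit in two, so the size no longer depends on the length of the field name.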
### Implementation Notes
1. **Proc Macro Enhancement**:
```rust
// In fory-derive/src/object.rs
use proc_macro::TokenStream;

#[proc_macro_derive(ForyObject, attributes(fory))]
pub fn derive_fory_object(input: TokenStream) -> TokenStream {
    // Parse #[fory(id = N, nullable, ref)] attributes
    // Generate optimized serialization code based on attributes
    let _ = input;
    TokenStream::new() // placeholder: the generated impl would be returned here
}
```
2. **Code Generation**:
```rust
// Generated code for #[fory(id = 0)] (non-nullable, no ref)
fn serialize_field_f1(&self, writer: &mut Writer) {
    // No null check, no ref tracking
    writer.write_string(&self.f1);
}

// Generated code for #[fory(id = 2, nullable)]
fn serialize_field_f3(&self, writer: &mut Writer) {
    match &self.f3 {
        Some(v) => {
            writer.write_not_null();
            writer.write_string(v);
        }
        None => writer.write_null(),
    }
}
```
3. **Compile-time Validation**:
- Error if duplicate tag IDs (>= 0) in same struct
- Error if `id < -1`
- Error if `Option<T>` field without `nullable`
- Warning if `Rc<T>`/`Arc<T>` without `ref` (potential circular ref
issues)
4. **Runtime Validation**:
- Panic if non-nullable field serialized with None value (shouldn't
happen in Rust)
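
The compile-time checks listed in item 3 could be prototyped as ordinary Rust before being wired into the proc macro. This is only a sketch: `FieldSpec` and `validate` are hypothetical, and a real implementation would read this information from the parsed attributes and report each problem as a compile error at the offending field.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of one annotated field as the macro might see it.
struct FieldSpec {
    name: &'static str,
    id: i32,
    nullable: bool,
    is_option: bool,
}

/// Returns every rule violation found; an empty list means the struct is valid.
fn validate(fields: &[FieldSpec]) -> Vec<String> {
    let mut errors = Vec::new();
    let mut seen_ids = HashSet::new();

    for f in fields {
        if f.id < -1 {
            errors.push(format!("`{}`: id must be >= -1, got {}", f.name, f.id));
        }
        // Only real tag IDs must be unique; -1 (field name encoding) may repeat.
        if f.id >= 0 && !seen_ids.insert(f.id) {
            errors.push(format!("`{}`: duplicate tag id {}", f.name, f.id));
        }
        if f.is_option && !f.nullable {
            errors.push(format!("`{}`: Option<T> requires `nullable`", f.name));
        }
    }
    errors
}

fn main() {
    let fields = [
        FieldSpec { name: "f1", id: 0, nullable: false, is_option: false },
        FieldSpec { name: "f2", id: 0, nullable: false, is_option: false },
        FieldSpec { name: "f3", id: -2, nullable: false, is_option: true },
    ];
    for e in validate(&fields) {
        println!("error: {}", e);
    }
}
```

Collecting all violations instead of stopping at the first one lets a user fix every annotation problem in a single compile cycle.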
### Example: Generated Code
```rust
#[derive(ForyObject)]
struct Foo {
    #[fory(id = 0)]
    name: String,
    #[fory(id = 1, nullable)]
    nickname: Option<String>,
}

// Generates approximately:
impl ForySerialize for Foo {
    fn serialize(&self, writer: &mut Writer) -> Result<()> {
        // Field: name (id=0, non-nullable, no ref)
        writer.write_tag_id(0);
        writer.write_string(&self.name)?;

        // Field: nickname (id=1, nullable, no ref)
        writer.write_tag_id(1);
        match &self.nickname {
            Some(v) => {
                writer.write_byte(NOT_NULL_FLAG);
                writer.write_string(v)?;
            }
            None => writer.write_byte(NULL_FLAG),
        }
        Ok(())
    }
}
```
### Performance Impact
For a struct with 10 fields using default settings (non-nullable, no ref
tracking):
- **Space savings**: ~20 bytes per object (null + ref flags)
- **CPU savings**: 10 fewer hash map operations per serialization
- **Zero runtime overhead** for metadata (all compile-time via proc macro)
## Additional context
This is the Rust equivalent of Java's `@ForyField` annotation. See [Java
issue #3000](https://github.com/apache/fory/issues/3000) for the original
design discussion.
Protocol spec:
https://fory.apache.org/docs/specification/fory_xlang_serialization_spec