chaokunyang opened a new issue, #3003:
URL: https://github.com/apache/fory/issues/3003
## Feature Request
Create a `fory::field<>` template class for field metadata to enable
performance and space optimization during xlang serialization.
## Is your feature request related to a problem? Please describe
Currently, Fory's C++ xlang serialization treats all struct fields uniformly:
1. **Null checks are always performed** - Even for fields that can never be
null, Fory writes a null/ref flag (1 byte per field)
2. **Reference tracking is always applied** (when enabled globally) - Even
fields that are never shared or cyclic are tracked, at the cost of a hash
lookup per object
3. **Field names use meta string encoding** - In schema evolution mode,
field names are compressed with meta string encoding, which still takes
several bytes for long names
These defaults ensure correctness but introduce unnecessary overhead when
the developer has more specific knowledge about their data model.
## Describe the solution you'd like
Add a `fory::field<>` template in `field.h` that wraps field types with
compile-time metadata:
```cpp
#include <fory/serialization/field.h>
#include <memory>
#include <optional>
#include <string>

// Bar and Node are user-defined types declared elsewhere.
struct Foo {
  // Field f1: non-nullable (default), no ref tracking (default)
  // Tag ID 0 provides compact encoding in schema evolution mode
  fory::field<std::string, fory::id<0>> f1;

  // Field f2: non-nullable (default), no ref tracking (default)
  fory::field<Bar, fory::id<1>> f2;

  // Field f3: nullable field that may contain null values
  fory::field<std::optional<std::string>, fory::id<2>, fory::nullable> f3;

  // Field parent: shared reference that needs tracking (e.g., for circular refs)
  fory::field<std::shared_ptr<Node>, fory::id<3>, fory::ref, fory::nullable>
      parent;

  // Field with long name: tag ID provides significant space savings
  fory::field<std::string, fory::id<4>>
      very_long_field_name_that_would_take_many_bytes;

  // Explicit opt-out: use field name encoding but get nullable optimization
  fory::field<std::optional<std::string>, fory::id<-1>, fory::nullable>
      optional_field;
};

// Register with Fory
FORY_REGISTER_TYPE(Foo);
```
### Template API Design
```cpp
#include <utility>  // std::move

namespace fory {

// Tag types for field properties
template <int N>
struct id { static constexpr int value = N; };
struct nullable { static constexpr bool value = true; };
struct ref { static constexpr bool value = true; };

// Field wrapper template
template <typename T, typename... Props>
class field {
public:
  using value_type = T;

  // Compile-time property extraction
  static constexpr int tag_id = /* extract from Props... */;
  static constexpr bool is_nullable = /* extract from Props... */;
  static constexpr bool track_ref = /* extract from Props... */;

  // Implicit conversion to/from T
  field() = default;
  field(const T& value) : value_(value) {}
  field(T&& value) : value_(std::move(value)) {}
  operator T&() { return value_; }
  operator const T&() const { return value_; }

  // Explicit access to the wrapped value
  T& get() { return value_; }
  const T& get() const { return value_; }
  T* operator->() { return &value_; }
  const T* operator->() const { return &value_; }

private:
  T value_;
};

} // namespace fory
```
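The `/* extract from Props... */` placeholders above could be filled in with C++17 fold expressions. The following is only a sketch of one possible approach, not the proposed implementation: the namespace `sketch_props` and the helpers `has_tag`, `id_count`, and `tag_id_of` are hypothetical names introduced here for illustration.

```cpp
#include <memory>
#include <string>
#include <type_traits>

namespace sketch_props {  // hypothetical namespace, not part of Fory

// Tag types mirroring the proposal.
template <int N> struct id { static constexpr int value = N; };
struct nullable {};
struct ref {};

namespace detail {
// Detects whether a property is an id<N> tag.
template <typename P> struct is_id : std::false_type {};
template <int N> struct is_id<id<N>> : std::true_type {};

// Yields N for id<N>; every other tag contributes 0.
template <typename P> struct id_value { static constexpr int value = 0; };
template <int N> struct id_value<id<N>> { static constexpr int value = N; };

// True if Tag appears anywhere in Props (false for an empty pack).
template <typename Tag, typename... Props>
inline constexpr bool has_tag = (std::is_same_v<Tag, Props> || ...);

// Number of id<N> tags in Props.
template <typename... Props>
inline constexpr int id_count = (0 + ... + (is_id<Props>::value ? 1 : 0));

// Summing is safe because field<> asserts that exactly one id<N> is present.
template <typename... Props>
inline constexpr int tag_id_of = (0 + ... + id_value<Props>::value);
}  // namespace detail

template <typename T, typename... Props>
class field {
  static_assert(detail::id_count<Props...> == 1,
                "exactly one id<N> tag is required");

public:
  using value_type = T;
  static constexpr int tag_id = detail::tag_id_of<Props...>;
  static constexpr bool is_nullable = detail::has_tag<nullable, Props...>;
  static constexpr bool track_ref = detail::has_tag<ref, Props...>;
  static_assert(tag_id >= -1, "tag IDs must be >= -1");

  T& get() { return value_; }
  const T& get() const { return value_; }

private:
  T value_{};
};

// Usage: properties can be listed in any order after the value type.
using Name = field<std::string, id<0>>;
using Parent = field<std::shared_ptr<int>, id<3>, ref, nullable>;
static_assert(Name::tag_id == 0 && !Name::is_nullable && !Name::track_ref);
static_assert(Parent::tag_id == 3 && Parent::is_nullable && Parent::track_ref);

}  // namespace sketch_props
```

Because the properties are matched by type rather than by position, `fory::ref` and `fory::nullable` can appear in either order, as in the `parent` field of the earlier example.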
### Alternative: Macro-based Approach
For compatibility with existing codebases that can't change field types:
```cpp
struct Foo {
  std::string f1;
  Bar f2;
  std::optional<std::string> f3;
  std::shared_ptr<Node> parent;
};

// Define field metadata separately
FORY_FIELD_INFO(Foo,
  FORY_FIELD(f1, id = 0),
  FORY_FIELD(f2, id = 1),
  FORY_FIELD(f3, id = 2, nullable = true),
  FORY_FIELD(parent, id = 3, ref = true, nullable = true)
);
```
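How `FORY_FIELD_INFO` would expand is not specified here. One way out-of-line metadata could be attached without touching the struct is a trait keyed on the member pointer. This is a sketch under that assumption only: `field_meta`, `SKETCH_FIELD_META`, and the namespace `sketch_meta` are hypothetical, and the positional macro arguments differ from the `id = 0` syntax shown above.

```cpp
#include <memory>
#include <optional>
#include <string>

namespace sketch_meta {  // hypothetical namespace, not part of Fory

struct Bar {};
struct Node {};

// Field types stay unchanged, as in the macro-based proposal.
struct Foo {
  std::string f1;
  Bar f2;
  std::optional<std::string> f3;
  std::shared_ptr<Node> parent;
};

// Default metadata: field-name encoding, non-nullable, no ref tracking.
template <auto MemberPtr>
struct field_meta {
  static constexpr int id = -1;
  static constexpr bool nullable = false;
  static constexpr bool ref = false;
};

// Hypothetical helper: specializes field_meta for a single member.
#define SKETCH_FIELD_META(Type, member, ID, NULLABLE, REF) \
  template <> struct field_meta<&Type::member> {           \
    static constexpr int id = (ID);                        \
    static constexpr bool nullable = (NULLABLE);           \
    static constexpr bool ref = (REF);                     \
  }

SKETCH_FIELD_META(Foo, f1, 0, false, false);
SKETCH_FIELD_META(Foo, f3, 2, true, false);
SKETCH_FIELD_META(Foo, parent, 3, true, true);

// f2 has no specialization, so it falls back to the defaults.
static_assert(field_meta<&Foo::f1>::id == 0);
static_assert(field_meta<&Foo::f2>::id == -1);
static_assert(field_meta<&Foo::parent>::ref);

}  // namespace sketch_meta
```

A serializer could then query `field_meta<&Foo::f3>::nullable` inside `if constexpr`, in the same way it would query `fory::field<...>::is_nullable`.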
### Design Decision: Required `id`
The `id` template parameter is **required**:
- `fory::id<0>` to `fory::id<N>`: Use tag ID encoding
- `fory::id<-1>`: Explicit opt-out, use field name encoding
Rationale:
1. **Explicit control**: Using `fory::field<>` means opting into explicit
control
2. **Compile-time validation**: IDs are compile-time constants, so their
uniqueness within a struct can be checked with `static_assert`
3. **Proven pattern**: Similar to protobuf field numbers
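A single `fory::field<>` instantiation cannot see its sibling fields, so a uniqueness check would presumably run wherever the full field list of a struct is known (for example during type registration). The sketch below shows the check itself, assuming the IDs have already been collected into a pack; `unique_tag_ids`, `tag_id_list`, and the namespace `sketch_ids` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

namespace sketch_ids {  // hypothetical namespace, not part of Fory

// Returns true when no non-negative ID occurs twice.
// IDs of -1 (field-name encoding) are allowed to repeat.
template <std::size_t N>
constexpr bool unique_tag_ids(const std::array<int, N>& ids) {
  for (std::size_t i = 0; i < N; ++i) {
    if (ids[i] < 0) continue;
    for (std::size_t j = i + 1; j < N; ++j) {
      if (ids[i] == ids[j]) return false;
    }
  }
  return true;
}

// Collects the tag IDs of one struct and validates them at compile time.
template <int... Ids>
struct tag_id_list {
  static_assert(((Ids >= -1) && ...), "tag IDs must be >= -1");
  static constexpr bool unique =
      unique_tag_ids(std::array<int, sizeof...(Ids)>{Ids...});
};

// Usage: the IDs from the Foo example are unique; a repeated ID is detected.
static_assert(tag_id_list<0, 1, 2, 3, 4, -1>::unique);
static_assert(!tag_id_list<0, 1, 1>::unique);

}  // namespace sketch_ids
```

The quadratic scan keeps the sketch short and is evaluated only by the compiler, so it adds no runtime cost.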
### Optimization Details
#### 1. Non-nullable (Default) Optimization
When `nullable` tag is NOT present:
- Skip writing the null flag entirely (1 byte saved per field)
- Directly serialize the field value
- A `std::optional<T>` field must add the `nullable` tag
#### 2. No Ref Tracking (Default) Optimization
When `ref` tag is NOT present:
- Skip reference tracking map operations
- Skip ref flag when combined with non-nullable
- For `std::shared_ptr<T>`, consider adding `ref` tag if circular refs are
possible
#### 3. Tag ID Optimization
When `id<N>` is used with `N >= 0`:
- The field is identified by a varint tag ID instead of a meta string encoded name
- Significant space savings for long field names
**Space savings:**
| Field Name | Meta String (approx) | Tag ID |
|------------|---------------------|--------|
| `f1` | ~2 bytes | 1 byte |
| `user_name` | ~6 bytes | 1 byte |
| `transaction_id` | ~10 bytes | 1 byte |
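The "1 byte" column holds for small IDs under a plain LEB128-style unsigned varint, which stores 7 payload bits per byte. The sketch below assumes that encoding purely for illustration; the exact field header layout is defined by the protocol spec and may differ. `varuint_size` is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch_varint {  // hypothetical namespace, not part of Fory

// Number of bytes needed to encode v as an unsigned varint
// with 7 payload bits per byte.
constexpr std::size_t varuint_size(std::uint32_t v) {
  std::size_t n = 1;
  while (v >= 0x80) {
    v >>= 7;
    ++n;
  }
  return n;
}

// Tag IDs 0..127 fit in one byte; 128..16383 need two.
static_assert(varuint_size(0) == 1);
static_assert(varuint_size(127) == 1);
static_assert(varuint_size(128) == 2);
static_assert(varuint_size(16383) == 2);

}  // namespace sketch_varint
```

Under this assumption, a struct would need more than 128 tagged fields before any tag ID cost more than one byte.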
### Implementation Notes
1. **Template Metaprogramming**:
- Use variadic templates to extract properties
- Provide `constexpr` accessors for compile-time queries
- Enable optimizations via `if constexpr`
2. **Serializer Integration**:
```cpp
template <typename T, typename... Props>
struct Serializer<fory::field<T, Props...>> {
  static void write(Writer& writer, const fory::field<T, Props...>& f) {
    if constexpr (!fory::field<T, Props...>::is_nullable) {
      // Skip null check, directly serialize
      Serializer<T>::write(writer, f.get());
    } else {
      // Write null flag, then value if not null
      // ...
    }
  }
};
```
3. **Zero Overhead**:
- `fory::field<T, ...>` should have the same memory layout as `T`
- All metadata is compile-time only
- No runtime overhead compared to raw field
4. **Validation**:
- `static_assert` for duplicate tag IDs at compile time
- `static_assert` rejecting IDs less than -1
- Runtime error if non-nullable field has null value
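The zero-overhead goal in item 3 can itself be verified at compile time. The following sketch uses a stripped-down stand-in for the wrapper (not the proposed class) and a hypothetical `same_layout_as_value` helper to compare size, alignment, and standard-layout status against the wrapped type.

```cpp
#include <string>
#include <type_traits>

namespace sketch_layout {  // hypothetical namespace, not part of Fory

template <int N> struct id { static constexpr int value = N; };

// Minimal stand-in for the wrapper: a single data member, no virtual
// functions, and metadata carried only in the template arguments.
template <typename T, typename... Props>
class field {
public:
  field() = default;
  field(const T& v) : value_(v) {}
  T& get() { return value_; }
  const T& get() const { return value_; }

private:
  T value_{};
};

// True when wrapping T changes neither size, alignment, nor
// standard-layout status.
template <typename T, typename... Props>
inline constexpr bool same_layout_as_value =
    sizeof(field<T, Props...>) == sizeof(T) &&
    alignof(field<T, Props...>) == alignof(T) &&
    std::is_standard_layout_v<field<T, Props...>> ==
        std::is_standard_layout_v<T>;

static_assert(same_layout_as_value<int, id<0>>);
static_assert(same_layout_as_value<double, id<1>>);
static_assert(same_layout_as_value<std::string, id<2>>);

}  // namespace sketch_layout
```

Assertions like these could live next to the wrapper definition so that a later change, such as adding a second data member, fails the build instead of silently growing every struct.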
### Performance Impact
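For a struct with 10 fields using default settings (non-nullable, no ref
tracking):
- **Space savings**: up to ~20 bytes per object (null + ref flags); about 10
bytes if only the single 1-byte null/ref flag per field described above is
counted
- **CPU savings**: up to 10 fewer hash map operations per serialization when
reference tracking is enabled globally
- **Zero runtime overhead** for metadata (all compile-time)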
## Additional context
This is the C++ equivalent of Java's `@ForyField` annotation. See [Java
issue #3000](https://github.com/apache/fory/issues/3000) for the original
design discussion.
Protocol spec:
https://fory.apache.org/docs/specification/fory_xlang_serialization_spec