chaokunyang opened a new issue, #3005:
URL: https://github.com/apache/fory/issues/3005

   ## Feature Request
   
   Add support for `fory` struct tags to provide field-level metadata for 
performance and space optimization during xlang serialization.
   
   ## Is your feature request related to a problem? Please describe
   
   Currently, Fory's Go xlang serialization treats all struct fields uniformly:
   1. **Null checks are always performed** - Even for fields that are never 
nil, Fory writes a null/ref flag (1 byte per field)
   2. **Reference tracking is always applied** (when enabled globally) - Even 
for fields that won't be shared/cyclic, objects are tracked with hash lookup 
cost
   3. **Field names use meta string encoding** - In schema evolution mode, 
field names are encoded using meta string compression, but for fields with long 
names, this still takes space
   
   These defaults ensure correctness but introduce unnecessary overhead when 
the developer has more specific knowledge about their data model.
   
   ## Describe the solution you'd like
   
   Add support for `fory` struct tags with field metadata:
   
   ```go
   type Foo struct {
       // Field F1: non-nullable (default), no ref tracking (default)
       // Tag ID 0 provides compact encoding in schema evolution mode
       F1 string `fory:"id=0"`
       
       // Field F2: non-nullable (default), no ref tracking (default)
       F2 Bar `fory:"id=1"`
       
       // Field F3: nullable field that may contain nil values
       F3 *string `fory:"id=2,nullable"`
       
       // Field Parent: shared reference that needs tracking (e.g., for 
circular refs)
       Parent *Node `fory:"id=3,ref,nullable"`
       
       // Field with long name: tag ID provides significant space savings
       VeryLongFieldNameThatWouldTakeManyBytes string `fory:"id=4"`
       
       // Explicit opt-out: use field name encoding but get nullable 
optimization
       OptionalField *string `fory:"id=-1,nullable"`
   }
   ```
   
   ### Tag Syntax
   
   ```go
   `fory:"id=<int>[,nullable][,ref]"`
   ```
   
   | Tag Component | Description |
   |---------------|-------------|
   | `id=N` | **Required**. Tag ID for field encoding. N >= 0 uses tag ID; N = 
-1 uses field name |
   | `nullable` | Optional. Field can be nil. Required for pointer types |
   | `ref` | Optional. Enable reference tracking for this field |
   
   ### Design Decision: Required `id`
   
   The `id` tag is **required** when using `fory` struct tags:
   - `id=0` to `id=N`: Use tag ID encoding (compact)
   - `id=-1`: Explicit opt-out, use field name encoding
   
   Rationale:
   1. **Explicit control**: Using `fory` tag means opting into explicit control
   2. **Runtime validation**: Can check for duplicate IDs during registration
   3. **Proven pattern**: Similar to protobuf field numbers, JSON tags
   
   ### Optimization Details
   
   #### 1. Non-nullable (Default) Optimization
   
   When `nullable` is NOT specified:
   - Skip writing the null flag entirely (1 byte saved per field)
   - Directly serialize the field value
   - Panic if pointer field is nil without `nullable` tag
   
   #### 2. No Ref Tracking (Default) Optimization
   
   When `ref` is NOT specified:
   - Skip reference tracking map operations
   - Skip ref flag when combined with non-nullable
   - For pointer types with potential circular refs, add `ref` tag
   
   #### 3. Tag ID Optimization
   
   When `id=N` where N >= 0:
   - Field name encoded as varint instead of meta string
   - Significant space savings for long field names
   
   **Space savings:**
   
   | Field Name | Meta String (approx) | Tag ID |
   |------------|---------------------|--------|
   | `F1` | ~2 bytes | 1 byte |
   | `UserName` | ~6 bytes | 1 byte |
   | `TransactionID` | ~10 bytes | 1 byte |
   
   ### Implementation Notes
   
   1. **Tag Parsing**:
      ```go
      // In fory/type.go or new file fory/field.go
      type FieldMeta struct {
          ID       int
          Nullable bool
          Ref      bool
      }
      
      func parseForygTag(tag string) (*FieldMeta, error) {
          // Parse "id=0,nullable,ref" format
      }
      ```
   
   2. **Registration Integration**:
      ```go
      // During type registration, parse fory tags
      func (f *Fory) RegisterType(t reflect.Type) error {
          for i := 0; i < t.NumField(); i++ {
              field := t.Field(i)
              if tag, ok := field.Tag.Lookup("fory"); ok {
                  meta, err := parseForyTag(tag)
                  // Store metadata for serialization
              }
          }
      }
      ```
   
   3. **Codegen Integration**:
      ```go
      //go:generate fory gen -type Foo
      
      // Generated code respects fory tags
      func (f *Foo) ForyEncode(writer *fory.Writer) error {
          // F1: id=0, non-nullable, no ref
          writer.WriteTagID(0)
          writer.WriteString(f.F1)
          
          // F3: id=2, nullable, no ref
          writer.WriteTagID(2)
          if f.F3 == nil {
              writer.WriteNullFlag()
          } else {
              writer.WriteNotNullFlag()
              writer.WriteString(*f.F3)
          }
          // ...
      }
      ```
   
   4. **Validation**:
      - **Panic** if duplicate tag IDs (>= 0) in same struct
      - **Panic** if `id < -1`
      - **Panic** if pointer field is nil without `nullable` tag
      - **Warning** (or panic in strict mode) if pointer type without `nullable`
   
   ### Compatibility with Existing Tags
   
   The `fory` tag can coexist with other tags:
   
   ```go
   type User struct {
       Name  string `json:"name" fory:"id=0"`
       Email string `json:"email" fory:"id=1"`
       Age   *int   `json:"age,omitempty" fory:"id=2,nullable"`
   }
   ```
   
   ### Performance Impact
   
   For a struct with 10 fields using default settings (non-nullable, no ref 
tracking):
   - **Space savings**: ~20 bytes per object (null + ref flags)
   - **CPU savings**: 10 fewer hash map operations per serialization
   
   For reflection-based serialization:
   - Tag parsing overhead is one-time during registration
   - Runtime serialization uses cached metadata
   
   For codegen-based serialization:
   - Zero runtime overhead for metadata
   - All optimizations applied at code generation time
   
   ## Additional context
   
   This is the Go equivalent of Java's `@ForyField` annotation. See [Java issue 
#3000](https://github.com/apache/fory/issues/3000) for the original design 
discussion.
   
   Protocol spec: 
https://fory.apache.org/docs/specification/fory_xlang_serialization_spec


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to