kitalkuyo-gita commented on issue #2609:
URL: https://github.com/apache/fory/issues/2609#issuecomment-3434797976

   Hi @drse, I have read your proposal carefully. In general, it is very 
comprehensive. The following is my design proposal.
   
   #  Class Registration API Improvement Proposal
   
   ## I. Current State Analysis
   
   ### 1.1 Current Constraints
   
   In the current implementation:
   - **Validation rule**: `classId >= 0 && classId < Short.MAX_VALUE` (i.e., 
0-32766)
   - **Reserved ID 0**: `NO_CLASS_ID = 0` indicates unregistered
   - **Internal reserved range**: IDs 1-36+ are used for internal types 
(primitives, wrappers, common collections, etc.)
   - **Internal end marker**: The `innerEndClassId` field records the last ID 
of internal class registration (approximately around 150)
   
   ### 1.2 Cross-Language Specification Constraints
   
   The cross-language serialization specification explicitly states:
   - Internal data types use range 0-64
   - Users can use 0-4096 to represent their types
   
   ### 1.3 Root Cause 
   
   The current `checkRegistration` method's error messages are not clear 
enough, failing to explain:
   - Which ID ranges are reserved
   - Why certain IDs are unavailable
   - The recommended ID range to use
   
   ## II. Recommended Solution (Hybrid Approach 1 + 2)
   
   After evaluating all proposed solutions, I recommend a **hybrid approach** 
combining the advantages of solutions 1 and 2:
   
   ### 2.1 Core Design Principles
   
   1. **Backward Compatibility**: Don't break existing APIs and serialization 
protocols
   2. **Progressive Enhancement**: Improve documentation and validation first, 
then consider API evolution
   3. **Clear Boundaries**: Clearly define user ID ranges
   4. **Friendly Prompts**: Provide detailed error messages and warnings
   
   ### 2.2 Implementation Steps
   
   #### Phase One: Immediate Improvements (No Breaking Changes)
   
   **Step 1: Define Constants and Ranges**
   
   Add to `ClassResolver`:
   
   ```java
   public class ClassResolver implements TypeResolver {
     // User-visible constants
     public static final short MIN_USER_CLASS_ID = 200;
     public static final short MAX_USER_CLASS_ID = 32766; // Short.MAX_VALUE - 1
     public static final short RESERVED_ID_START = 0;
     public static final short RESERVED_ID_END = 199; // Reserve 200 IDs
     
     // Internal use
     private static final String RESERVED_RANGE_MSG = 
       "Class ID %d is in the reserved range [%d, %d]. " +
       "Please use IDs in the range [%d, %d] for user classes.";
     private static final String OUT_OF_RANGE_MSG = 
       "Class ID %d is out of valid range. " +
       "Valid range for user classes is [%d, %d].";
   }
   ```
   
   **Step 2: Improve Validation Logic** [4](#1-3) 
   
   Modify the `register(Class<?> cls, int classId)` method:
   
   ```java
   public void register(Class<?> cls, int classId) {
     // Improved validation
     Preconditions.checkArgument(classId >= 0 && classId < Short.MAX_VALUE,
       "Class ID must be in range [0, %d), got: %d", Short.MAX_VALUE, classId);
     
     short id = (short) classId;
     
     // Check if in reserved range
     if (id < MIN_USER_CLASS_ID) {
       LOG.warn(String.format(RESERVED_RANGE_MSG, 
         id, RESERVED_ID_START, RESERVED_ID_END, MIN_USER_CLASS_ID, 
MAX_USER_CLASS_ID));
     }
     
     checkRegistration(cls, id, cls.getName());
     // ... rest of code remains unchanged
   }
   ```
   
   **Step 3: Add Helper Methods**
   
   ```java
   // New public API methods
   public static short getMinUserClassId() {
     return MIN_USER_CLASS_ID;
   }
   
   public static short getMaxUserClassId() {
     return MAX_USER_CLASS_ID;
   }
   
   public static boolean isValidUserClassId(int classId) {
     return classId >= MIN_USER_CLASS_ID && classId <= MAX_USER_CLASS_ID;
   }
   
   public static boolean isReservedClassId(int classId) {
     return classId >= RESERVED_ID_START && classId < MIN_USER_CLASS_ID;
   }
   
   // Improved auto-registration that returns the assigned ID
   public short registerAndGetId(Class<?> cls) {
     if (!extRegistry.registeredClassIdMap.containsKey(cls)) {
       while (extRegistry.classIdGenerator < registeredId2ClassInfo.length
           && registeredId2ClassInfo[extRegistry.classIdGenerator] != null) {
         extRegistry.classIdGenerator++;
       }
       register(cls, extRegistry.classIdGenerator);
       return extRegistry.classIdGenerator;
     }
     return extRegistry.registeredClassIdMap.get(cls);
   }
   ```
   
   **Step 4: Improve Documentation**
   
   Update the `BaseFory` interface documentation:
   
   ```java
   /**
    * Register class with specified id. 
    * 
    * <p><b>Important:</b> Class IDs in the range [0, 199] are reserved for 
internal use.
    * User classes should use IDs in the range [200, 32766].
    * 
    * <p>The method will emit a warning if you register a class with a reserved 
ID,
    * but it will not fail to maintain backward compatibility.
    * 
    * <p>Use {@link #getMinUserClassId()} and {@link #getMaxUserClassId()} to 
get
    * the recommended ID range.
    * 
    * @param cls class to register
    * @param id class ID, recommended range: [200, 32766]
    * @throws IllegalArgumentException if id is negative or >= 32767
    */
   void register(Class<?> cls, int id);
   ```
   
   #### Phase Two: API Evolution (Next Major Version)
   
   **Option A: Add Type-Safe API**
   
   ```java
   // New API using short type for explicit constraints
   public void registerUser(Class<?> cls, short userId) {
     Preconditions.checkArgument(
       userId >= 0 && userId <= (MAX_USER_CLASS_ID - MIN_USER_CLASS_ID),
       "User ID must be in range [0, %d], got: %d", 
       MAX_USER_CLASS_ID - MIN_USER_CLASS_ID, userId);
     
     int internalId = MIN_USER_CLASS_ID + userId;
     register(cls, internalId);
   }
   ```
   
   **Option B: Support Namespaces (If Needed for Cross-Language Scenarios)** 
   
   Fory already supports name-based registration, which can be enhanced rather 
than introducing a new namespace concept.
   
   ### 2.3 Test Plan
   
   Test cases to add:
   
   1. **Boundary Tests**:
      - Test ID 0 (should warn)<cite/>
      - Test ID 199 (should warn)<cite/>
      - Test ID 200 (should succeed)<cite/>
      - Test ID 32766 (should succeed)<cite/>
      - Test ID 32767 (should fail)<cite/>
   
   2. **Error Message Tests**:
      - Verify error messages contain recommended ID ranges<cite/>
      - Verify warning logs are correctly recorded<cite/>
   
   3. **Compatibility Tests**:
      - Ensure existing code using reserved IDs still works<cite/>
      - Ensure serialization/deserialization is compatible with old 
versions<cite/>
   
   ### 2.4 Migration Guide
   
   For existing users:
   
   ```java
   // Old code (still works, but with warnings)
   fory.register(MyClass.class, 100);
   
   // Recommended new code
   fory.register(MyClass.class, 200); // or larger value
   
   // Or use helper methods
   if (ClassResolver.isValidUserClassId(myId)) {
     fory.register(MyClass.class, myId);
   }
   ```
   
   ## III. Evaluation of Other Solutions
   
   ### 3.1 Solution 3 (Full Openness) - **Not Recommended**
   Allowing users to override all internal registrations would break system 
stability and could cause core serialization functionality to fail.<cite/>
   
   ### 3.2 Solution 4 (Protocol Modification) - **Not Recommended**
   Modifying the serialization protocol would break cross-language 
compatibility at too high a cost.
   
   ### 3.3 Solution 5 (Namespaces) - **Not Recommended for Now**
   Introducing a complex namespace system would significantly increase API 
complexity.
   The current name-based registration mechanism is sufficient.
   
   ### 3.4 Solution 6 (Automatic Encoding) - **Consider for Future Enhancement**
   Using Base62/85 encoding to handle out-of-range values is an interesting 
idea, but would increase serialization overhead.
   It could be considered as an opt-in feature in future versions.
   
   ## IV. Implementation Priority
   
   **P0 (Immediate Implementation)**:
   1. Add constant definitions (MIN_USER_CLASS_ID, MAX_USER_CLASS_ID)
   2. Improve error messages
   3. Add warning logs
   4. Update documentation
   
   **P1 (Next Minor Version)**:
   1. Add helper methods (isValidUserClassId, isReservedClassId)
   2. Add registerAndGetId method
   3. Complete test cases
   
   **P2 (Next Major Version)**:
   1. Consider adding registerUser API
   2. Possibly deprecate direct use of reserved ranges
   
   ## Notes
   
   **Key Decision Rationale**:
   
   1. **Why choose 200 as the starting point**: Current estimates show 
approximately 150 internal registrations, reserving up to 200 provides 
sufficient buffer.
   
   2. **Why not modify the protocol**: The cross-language serialization 
protocol is already stable, modification costs are extremely high and would 
affect Java, Python, Go, Rust, C++ and other language implementations.
   
   3. **Backward compatibility**: Existing tests use IDs like 300-302, and this 
code must continue to work.
   
   4. **Progressive improvement principle**: First improve usability through 
documentation, warnings, and helper methods, then introduce API changes in 
major versions to avoid breaking changes.
   
   This solution balances usability improvements, backward compatibility, and 
implementation costs, making it the most pragmatic production-grade solution.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to