kitalkuyo-gita commented on issue #2609:
URL: https://github.com/apache/fory/issues/2609#issuecomment-3434797976
Hi @drse, I have read your proposal carefully. In general, it is very
comprehensive. The following is my design proposal.
# Class Registration API Improvement Proposal
## I. Current State Analysis
### 1.1 Current Constraints
In the current implementation:
- **Validation rule**: `classId >= 0 && classId < Short.MAX_VALUE` (i.e.,
0-32766)
- **Reserved ID 0**: `NO_CLASS_ID = 0` indicates unregistered
- **Internal reserved range**: IDs 1-36+ are used for internal types
(primitives, wrappers, common collections, etc.)
- **Internal end marker**: The `innerEndClassId` field records the last ID
of internal class registration (approximately around 150)
### 1.2 Cross-Language Specification Constraints
The cross-language serialization specification explicitly states:
- Internal data types use range 0-64
- Users can use 0-4096 to represent their types
### 1.3 Root Cause
The current `checkRegistration` method's error messages are not clear
enough, failing to explain:
- Which ID ranges are reserved
- Why certain IDs are unavailable
- The recommended ID range to use
## II. Recommended Solution (Hybrid Approach 1 + 2)
After evaluating all proposed solutions, I recommend a **hybrid approach**
combining the advantages of solutions 1 and 2:
### 2.1 Core Design Principles
1. **Backward Compatibility**: Don't break existing APIs and serialization
protocols
2. **Progressive Enhancement**: Improve documentation and validation first,
then consider API evolution
3. **Clear Boundaries**: Clearly define user ID ranges
4. **Friendly Prompts**: Provide detailed error messages and warnings
### 2.2 Implementation Steps
#### Phase One: Immediate Improvements (No Breaking Changes)
**Step 1: Define Constants and Ranges**
Add to `ClassResolver`:
```java
public class ClassResolver implements TypeResolver {
// User-visible constants
public static final short MIN_USER_CLASS_ID = 200;
public static final short MAX_USER_CLASS_ID = 32766; // Short.MAX_VALUE - 1
public static final short RESERVED_ID_START = 0;
public static final short RESERVED_ID_END = 199; // Reserve 200 IDs
// Internal use
private static final String RESERVED_RANGE_MSG =
"Class ID %d is in the reserved range [%d, %d]. " +
"Please use IDs in the range [%d, %d] for user classes.";
private static final String OUT_OF_RANGE_MSG =
"Class ID %d is out of valid range. " +
"Valid range for user classes is [%d, %d].";
}
```
**Step 2: Improve Validation Logic** [4](#1-3)
Modify the `register(Class<?> cls, int classId)` method:
```java
public void register(Class<?> cls, int classId) {
// Improved validation
Preconditions.checkArgument(classId >= 0 && classId < Short.MAX_VALUE,
"Class ID must be in range [0, %d), got: %d", Short.MAX_VALUE, classId);
short id = (short) classId;
// Check if in reserved range
if (id < MIN_USER_CLASS_ID) {
LOG.warn(String.format(RESERVED_RANGE_MSG,
id, RESERVED_ID_START, RESERVED_ID_END, MIN_USER_CLASS_ID,
MAX_USER_CLASS_ID));
}
checkRegistration(cls, id, cls.getName());
// ... rest of code remains unchanged
}
```
**Step 3: Add Helper Methods**
```java
// New public API methods
public static short getMinUserClassId() {
return MIN_USER_CLASS_ID;
}
public static short getMaxUserClassId() {
return MAX_USER_CLASS_ID;
}
public static boolean isValidUserClassId(int classId) {
return classId >= MIN_USER_CLASS_ID && classId <= MAX_USER_CLASS_ID;
}
public static boolean isReservedClassId(int classId) {
return classId >= RESERVED_ID_START && classId < MIN_USER_CLASS_ID;
}
// Improved auto-registration that returns the assigned ID
public short registerAndGetId(Class<?> cls) {
if (!extRegistry.registeredClassIdMap.containsKey(cls)) {
while (extRegistry.classIdGenerator < registeredId2ClassInfo.length
&& registeredId2ClassInfo[extRegistry.classIdGenerator] != null) {
extRegistry.classIdGenerator++;
}
register(cls, extRegistry.classIdGenerator);
return extRegistry.classIdGenerator;
}
return extRegistry.registeredClassIdMap.get(cls);
}
```
**Step 4: Improve Documentation**
Update the `BaseFory` interface documentation:
```java
/**
* Register class with specified id.
*
* <p><b>Important:</b> Class IDs in the range [0, 199] are reserved for
internal use.
* User classes should use IDs in the range [200, 32766].
*
* <p>The method will emit a warning if you register a class with a reserved
ID,
* but it will not fail to maintain backward compatibility.
*
* <p>Use {@link #getMinUserClassId()} and {@link #getMaxUserClassId()} to
get
* the recommended ID range.
*
* @param cls class to register
* @param id class ID, recommended range: [200, 32766]
* @throws IllegalArgumentException if id is negative or >= 32767
*/
void register(Class<?> cls, int id);
```
#### Phase Two: API Evolution (Next Major Version)
**Option A: Add Type-Safe API**
```java
// New API using short type for explicit constraints
public void registerUser(Class<?> cls, short userId) {
Preconditions.checkArgument(
userId >= 0 && userId <= (MAX_USER_CLASS_ID - MIN_USER_CLASS_ID),
"User ID must be in range [0, %d], got: %d",
MAX_USER_CLASS_ID - MIN_USER_CLASS_ID, userId);
int internalId = MIN_USER_CLASS_ID + userId;
register(cls, internalId);
}
```
**Option B: Support Namespaces (If Needed for Cross-Language Scenarios)**
Fory already supports name-based registration, which can be enhanced rather
than introducing a new namespace concept.
### 2.3 Test Plan
Test cases to add:
1. **Boundary Tests**:
- Test ID 0 (should warn)<cite/>
- Test ID 199 (should warn)<cite/>
- Test ID 200 (should succeed)<cite/>
- Test ID 32766 (should succeed)<cite/>
- Test ID 32767 (should fail)<cite/>
2. **Error Message Tests**:
- Verify error messages contain recommended ID ranges<cite/>
- Verify warning logs are correctly recorded<cite/>
3. **Compatibility Tests**:
- Ensure existing code using reserved IDs still works<cite/>
- Ensure serialization/deserialization is compatible with old
versions<cite/>
### 2.4 Migration Guide
For existing users:
```java
// Old code (still works, but with warnings)
fory.register(MyClass.class, 100);
// Recommended new code
fory.register(MyClass.class, 200); // or larger value
// Or use helper methods
if (ClassResolver.isValidUserClassId(myId)) {
fory.register(MyClass.class, myId);
}
```
## III. Evaluation of Other Solutions
### 3.1 Solution 3 (Full Openness) - **Not Recommended**
Allowing users to override all internal registrations would break system
stability and could cause core serialization functionality to fail.<cite/>
### 3.2 Solution 4 (Protocol Modification) - **Not Recommended**
Modifying the serialization protocol would break cross-language
compatibility at too high a cost.
### 3.3 Solution 5 (Namespaces) - **Not Recommended for Now**
Introducing a complex namespace system would significantly increase API
complexity.
The current name-based registration mechanism is sufficient.
### 3.4 Solution 6 (Automatic Encoding) - **Consider for Future Enhancement**
Using Base62/85 encoding to handle out-of-range values is an interesting
idea, but would increase serialization overhead.
It could be considered as an opt-in feature in future versions.
## IV. Implementation Priority
**P0 (Immediate Implementation)**:
1. Add constant definitions (MIN_USER_CLASS_ID, MAX_USER_CLASS_ID)
2. Improve error messages
3. Add warning logs
4. Update documentation
**P1 (Next Minor Version)**:
1. Add helper methods (isValidUserClassId, isReservedClassId)
2. Add registerAndGetId method
3. Complete test cases
**P2 (Next Major Version)**:
1. Consider adding registerUser API
2. Possibly deprecate direct use of reserved ranges
## Notes
**Key Decision Rationale**:
1. **Why choose 200 as the starting point**: Current estimates show
approximately 150 internal registrations, reserving up to 200 provides
sufficient buffer.
2. **Why not modify the protocol**: The cross-language serialization
protocol is already stable, modification costs are extremely high and would
affect Java, Python, Go, Rust, C++ and other language implementations.
3. **Backward compatibility**: Existing tests use IDs like 300-302, and this
code must continue to work.
4. **Progressive improvement principle**: First improve usability through
documentation, warnings, and helper methods, then introduce API changes in
major versions to avoid breaking changes.
This solution balances usability improvements, backward compatibility, and
implementation costs, making it the most pragmatic production-grade solution.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]