This is an automated email from the ASF dual-hosted git repository.
chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-fury.git
The following commit(s) were added to refs/heads/main by this push:
new 93800888 feat(sepc): update type meta field info spec (#1603)
93800888 is described below
commit 93800888595065b2690fec093ab0cbfd6ac7dedc
Author: Shawn Yang <[email protected]>
AuthorDate: Mon May 6 22:56:27 2024 +0800
feat(sepc): update type meta field info spec (#1603)
## What does this PR do?
Update type meta field info spec:
```
- field info:
  - header(8 bits): `3 bits size + 2 bits field name encoding + polymorphism flag + nullability flag + ref tracking flag`.
    Users can use annotations to provide this info.
    - 2 bits field name encoding:
      - encoding: `UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL/TAG_ID`
      - If tag id is used, i.e. the field name is written as an unsigned varint tag id, the 2 bits encoding will be `11`.
    - size of field name:
      - The `3 bits size: 0~7` is used to indicate length `1~7`; the value `7` indicates that more bytes must be read,
        and `size - 7` is then encoded as a varint.
      - If encoding is `TAG_ID`, then num_bytes of field name will be used to store tag id.
    - ref tracking: when set to 1, ref tracking will be enabled for this field.
    - nullability: when set to 1, this field can be null.
    - polymorphism: when set to 1, the actual type of the field will be the declared field type even if the type is
      not `final`.
  - field name: If tag id is set, the tag id is written. Otherwise the meta string encoding `[length]` and data are
    written.
```
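As a rough illustration of the 8-bit header described above, the following Python sketch packs and unpacks the five components. It is not taken from the Fury code base. The bit order (components laid out from the most significant bit to the least significant bit in the order the format string lists them) is an assumption, as are the numeric values for every encoding other than `TAG_ID`, which the text gives as `11`. The names `FieldNameEncoding`, `FieldHeader`, `pack_field_header` and `unpack_field_header` are hypothetical. The sketch stores the raw 3-bit size value and does not interpret it as a length.

```python
from enum import IntEnum
from typing import NamedTuple


class FieldNameEncoding(IntEnum):
    # Only TAG_ID = 0b11 is stated in the spec text; the other values are
    # ASSUMED from the order in which the encodings are listed.
    UTF8 = 0b00
    ALL_TO_LOWER_SPECIAL = 0b01
    LOWER_UPPER_DIGIT_SPECIAL = 0b10
    TAG_ID = 0b11


class FieldHeader(NamedTuple):
    size_bits: int                # raw 3-bit size value, 0..7
    encoding: FieldNameEncoding   # 2-bit field name encoding
    polymorphic: bool             # polymorphism flag
    nullable: bool                # nullability flag
    track_ref: bool               # ref tracking flag


def pack_field_header(header: FieldHeader) -> int:
    """Pack the header into one byte (ASSUMED layout: size in the top 3 bits)."""
    if not 0 <= header.size_bits <= 7:
        raise ValueError("size_bits must fit in 3 bits")
    return (
        (header.size_bits << 5)
        | (int(header.encoding) << 3)
        | (int(header.polymorphic) << 2)
        | (int(header.nullable) << 1)
        | int(header.track_ref)
    )


def unpack_field_header(byte: int) -> FieldHeader:
    """Inverse of pack_field_header under the same assumed layout."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("header must be a single byte")
    return FieldHeader(
        size_bits=(byte >> 5) & 0b111,
        encoding=FieldNameEncoding((byte >> 3) & 0b11),
        polymorphic=bool((byte >> 2) & 1),
        nullable=bool((byte >> 1) & 1),
        track_ref=bool(byte & 1),
    )
```

For example, `pack_field_header(FieldHeader(3, FieldNameEncoding.TAG_ID, False, True, True))` yields `0b01111011` under these assumptions, and `unpack_field_header` recovers the same tuple.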
## Related issues
#1556
## Does this PR introduce any user-facing change?
- [ ] Does this PR introduce any public API change?
- [ ] Does this PR introduce any binary protocol compatibility change?
## Benchmark
---
docs/specification/xlang_serialization_spec.md | 34 ++++++++++++++------------
1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/docs/specification/xlang_serialization_spec.md
b/docs/specification/xlang_serialization_spec.md
index 0583e7e0..7b46b556 100644
--- a/docs/specification/xlang_serialization_spec.md
+++ b/docs/specification/xlang_serialization_spec.md
@@ -288,23 +288,27 @@ Meta header is a 64 bits number value encoded in little
endian order.
fields, then use fields info in meta for deserializing compatible fields.
- type id: the registered id for the current type, which will be written as an
unsigned varint.
- field info:
- - Header(8 bits):
- - Format:
- - `reserved 1 bit + 3 bits field name encoding + polymorphism flag
+ nullability flag + ref tracking flag + tag id flag`.
- - Users can use annotation to provide that info.
- - tag id: when set to 1, the field name will be written by an
unsigned varint tag id.
- - ref tracking: when set to 0, ref tracking will be disabled for
this field.
- - nullability: when set to 0, this field won't be null.
- - polymorphism: when set to 1, the actual type of field will be
the declared field type even the type if
- not `final`.
- - 3 bits field name encoding will be set to meta string encoding
flags when tag id is not set.
- - Type id:
+ - header(8
+ bits): `3 bits size + 2 bits field name encoding + polymorphism flag +
nullability flag + ref tracking flag`.
+ Users can use annotation to provide those info.
+ - 2 bits field name encoding:
+ - encoding:
`UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL/TAG_ID`
+ - If tag id is used, i.e. field name is written by an unsigned
varint tag id. 2 bits encoding will be `11`.
+ - size of field name:
+ - The `3 bits size: 0~7` will be used to indicate length `1~7`,
the value `7` indicates to read more bytes,
+ the encoding will encode `size - 7` as a varint next.
+ - If encoding is `TAG_ID`, then num_bytes of field name will be
used to store tag id.
+ - ref tracking: when set to 1, ref tracking will be enabled for this
field.
+ - nullability: when set to 1, this field can be null.
+ - polymorphism: when set to 1, the actual type of field will be the
declared field type even the type if
+ not `final`.
+ - field name: If tag id is set, tag id will be used instead. Otherwise
meta string encoding `[length]` and data will
+ be written instead.
+ - type id:
- For registered type-consistent classes, it will be the registered
type id.
- Otherwise it will be encoded as `OBJECT_ID` if it isn't `final` and
`FINAL_OBJECT_ID` if it's `final`. The
meta for such types is written separately instead of inlining here
is to reduce meta space cost if object of
- this type is serialized in the current object graph multiple times,
and the field value may be null too.
- - Field name: If tag id is set, tag id will be used instead. Otherwise
meta string encoding length and data will
- be written instead.
+ this type is serialized in current object graph multiple times, and
the field value may be null too.
Field order are left as implementation details, which is not exposed to
specification, the deserialization need to
resort fields based on Fury field comparator. In this way, fury can compute
statistics for field names or types and
@@ -473,7 +477,7 @@ which will be encoded by elements header, each use one bit:
By default, all bits are unset, which means all elements won't track ref, all
elements are same type, not null and
the actual element is the declared type in the custom type field.
-The implementation can generate different deserialization code based read
header, and look up the generated code from
+The implementation can generate different deserialization code based read
header, and look up the generated code from
a linear map/list.
#### elements data