This is an automated email from the ASF dual-hosted git repository.
chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-fury-site.git
The following commit(s) were added to refs/heads/main by this push:
new 6fd68e6 add metastring spec link to metastring blog (#123)
6fd68e6 is described below
commit 6fd68e642904e8adc6d3b8463d7c8cfb1489a104
Author: Shawn Yang <[email protected]>
AuthorDate: Tue May 7 17:26:43 2024 +0800
add metastring spec link to metastring blog (#123)
---
blog/2024-05-06-metastring-space-efficient_encoding_for_string.md | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/blog/2024-05-06-metastring-space-efficient_encoding_for_string.md b/blog/2024-05-06-metastring-space-efficient_encoding_for_string.md
index 550bfdd..9afee9e 100644
--- a/blog/2024-05-06-metastring-space-efficient_encoding_for_string.md
+++ b/blog/2024-05-06-metastring-space-efficient_encoding_for_string.md
@@ -14,7 +14,7 @@ will take one byte for every char, which is not space efficient actually.
If we take a deeper look, we will found that most chars are **lowercase chars,
`.`, `$` and `_`**, which can be expressed in a much
smaller range **`0~32`**. But one byte can represent range `0~255`, the
significant bits are wasted, and this cost is not ignorable. In a dynamic
serialization
-framework, such meta will take considerable cost compared to real data.
+framework, such meta will take considerable cost compared to actual data.
So we proposed a new string encoding algorithm which we called **meta string
encoding** in Fury. It will encode most chars using `5` bits instead of `8`
bits in utf-8 encoding, which can bring **37.5% space cost savings** compared
to utf-8 encoding.
@@ -26,6 +26,7 @@ Such a string is enumerated and limited, so the encoding performance is not impo
Meta string encoding uses `5/6` bits instead of `8` bits in utf-8 encoding for
every chars. Since it uses less bits than utf8, it can bring
**37.5% space cost savings** compared to utf-8 and has a smaller encoded
binary size, which uses less storage and makes the network transfer faster.
+More details about meta string spec can be found in [Fury xlang serialization specification](https://fury.apache.org/docs/specification/fury_xlang_serialization_spec/#meta-string).
## Encoding Algorithms
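
The blog text touched by this diff claims that packing most chars into `5` bits instead of `8` saves about 37.5% of the space. The sketch below illustrates only that arithmetic. The alphabet, its ordering, and the `pack5`/`unpack5` helpers are assumptions made for illustration; they are not Fury's actual tables, header bits, or API, which are defined in the linked specification.

```python
# Hypothetical 5-bit alphabet: 26 lowercase letters plus '.', '_' and '$'.
# This is an illustrative assumption, not the table used by Fury.
ALPHABET = "abcdefghijklmnopqrstuvwxyz._$"
CODE = {ch: i for i, ch in enumerate(ALPHABET)}


def pack5(text: str) -> bytes:
    """Pack each char into 5 bits, first char in the most significant bits."""
    bits = 0
    for ch in text:
        bits = (bits << 5) | CODE[ch]  # raises KeyError for chars outside ALPHABET
    nbits = 5 * len(text)
    pad = (-nbits) % 8  # zero bits appended to reach a byte boundary
    return (bits << pad).to_bytes((nbits + pad) // 8, "big")


def unpack5(data: bytes, length: int) -> str:
    """Inverse of pack5; the caller must supply the original char count."""
    bits = int.from_bytes(data, "big") >> (len(data) * 8 - 5 * length)
    chars = []
    for i in range(length):
        shift = 5 * (length - 1 - i)
        chars.append(ALPHABET[(bits >> shift) & 0x1F])
    return "".join(chars)


name = "org.apache.fury.benchmark.data"
packed = pack5(name)
print(len(name.encode("utf-8")), len(packed))  # prints: 30 19
```

Here 30 chars need 150 bits, which round up to 19 bytes, a saving of roughly 37% over one byte per char. The saving reaches exactly 37.5% only when the char count is a multiple of 8, and a real encoding also has to record the length or padding, which this sketch leaves to the caller.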