(incubator-fury-site) branch main updated: 🔄 synced local 'docs/specification/' with remote 'docs/specification/'

chaokunyang Mon, 15 Apr 2024 04:34:21 -0700

This is an automated email from the ASF dual-hosted git repository.

chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-fury-site.git



The following commit(s) were added to refs/heads/main by this push:
     new 11388b1  🔄 synced local 'docs/specification/' with remote 'docs/specification/'
11388b1 is described below

commit 11388b10bda33c7d55d050fe89642c551d98ae41
Author: chaokunyang <[email protected]>
AuthorDate: Mon Apr 15 11:34:07 2024 +0000

    🔄 synced local 'docs/specification/' with remote 'docs/specification/'
---
 docs/specification/java_serialization_spec.md  | 37 +++++++++++++++-----------
 docs/specification/xlang_serialization_spec.md | 35 +++++++++++++-----------
 2 files changed, 40 insertions(+), 32 deletions(-)

diff --git a/docs/specification/java_serialization_spec.md b/docs/specification/java_serialization_spec.md
index a5d0bec..b05af49 100644
--- a/docs/specification/java_serialization_spec.md
+++ b/docs/specification/java_serialization_spec.md
@@ -3,6 +3,7 @@ title: Fury Java Serialization Format
 sidebar_position: 1
 id: fury_java_serialization_spec
 ---
+
 # Fury Java Serialization Specification
 
 ## Spec overview
@@ -222,25 +223,29 @@ Meta string is mainly used to encode meta strings such as class name and field n
 
 String binary encoding algorithm:
 
-| Algorithm                 | Pattern        | Description |
-|---------------------------|----------------|--------------------------------------------------------------------------------------------------------------------------------------------------|
-| LOWER_SPECIAL             | `a-z._$\|`     | every char is written using 5 bits, `a-z`: `0b00000~0b11001`, `._$\|`: `0b11010~0b11101` |
-| LOWER_UPPER_DIGIT_SPECIAL | `a-zA-Z0~9._$` | every char is written using 6 bits, `a-z`: `0b00000~0b11110`, `A-Z`: `0b11010~0b110011`, `0~9`: `0b110100~0b111101`, `._$`: `0b111110~0b1000000` |
-| UTF-8                     | any chars      | UTF-8 encoding |
+| Algorithm                 | Pattern            | Description |
+|---------------------------|--------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| LOWER_SPECIAL             | `a-z._$\|`         | every char is written using 5 bits, `a-z`: `0b00000~0b11001`, `._$\|`: `0b11010~0b11101` |
+| LOWER_UPPER_DIGIT_SPECIAL | `a-zA-Z0~9[c1,c2]` | every char is written using 6 bits, `a-z`: `0b00000~0b11001`, `A-Z`: `0b11010~0b110011`, `0~9`: `0b110100~0b111101`, `c1,c2`: `0b111110~0b111111`, `c1,c2` should be two of `._$` |
+| UTF-8                     | any chars          | UTF-8 encoding |
 
 Encoding flags:
 
-| Encoding Flag             | Pattern | Encoding Algorithm |
-|---------------------------|-----------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|
-| LOWER_SPECIAL             | every char is in `a-z._$\|` | `LOWER_SPECIAL` |
-| REP_FIRST_LOWER_SPECIAL   | every char is in `a-z._$` except first char is upper case | replace first upper case char to lower case, then use `LOWER_SPECIAL` |
-| REP_MUL_LOWER_SPECIAL     | every char is in `a-zA-Z._$` | replace every upper case char by `\|` + `lower case`, then use `LOWER_SPECIAL`, use this encoding if it's smaller than Encoding `3` |
-| LOWER_UPPER_DIGIT_SPECIAL | every char is in `a-zA-Z._$` | use `LOWER_UPPER_DIGIT_SPECIAL` encoding if it's smaller than Encoding `2` |
-| UTF8                      | any utf-8 char | use `UTF-8` encoding |
-| Compression               | any utf-8 char | lossless compression |
-
-Depending on cases, one can choose encoding `flags + data` jointly, uses 3 bits of first byte for flags and other bytes
-for data.
+| Encoding Flag             | Pattern | Encoding Algorithm |
+|---------------------------|---------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| LOWER_SPECIAL             | every char is in `a-z._$\|` | `LOWER_SPECIAL` |
+| FIRST_TO_LOWER_SPECIAL    | every char is in `a-z[c1,c2]` except first char is upper case | replace first upper case char to lower case, then use `LOWER_SPECIAL` |
+| ALL_TO_LOWER_SPECIAL      | every char is in `a-zA-Z[c1,c2]` | replace every upper case char by `\|` + `lower case`, then use `LOWER_SPECIAL`, use this encoding if it's smaller than Encoding `LOWER_UPPER_DIGIT_SPECIAL` |
+| LOWER_UPPER_DIGIT_SPECIAL | every char is in `a-zA-Z[c1,c2]` | use `LOWER_UPPER_DIGIT_SPECIAL` encoding if it's smaller than Encoding `FIRST_TO_LOWER_SPECIAL` |
+| UTF8                      | any utf-8 char | use `UTF-8` encoding |
+| Compression               | any utf-8 char | lossless compression |
+
+Notes:
+
+- For package name encoding, `c1,c2` should be `._`; For field/type name encoding, `c1,c2` should be `_$`;
+- Depending on cases, one can choose encoding `flags + data` jointly, uses 3 bits of first byte for flags and other
+  bytes
+  for data.
 
 ### Shared meta string
 
diff --git a/docs/specification/xlang_serialization_spec.md b/docs/specification/xlang_serialization_spec.md
index 4641d2b..dd8c672 100644
--- a/docs/specification/xlang_serialization_spec.md
+++ b/docs/specification/xlang_serialization_spec.md
@@ -338,25 +338,28 @@ Meta string is mainly used to encode meta strings such as field names.
 
 String binary encoding algorithm:
 
-| Algorithm                 | Pattern        | Description |
-|---------------------------|----------------|--------------------------------------------------------------------------------------------------------------------------------------------------|
-| LOWER_SPECIAL             | `a-z._$\|`     | every char is written using 5 bits, `a-z`: `0b00000~0b11001`, `._$\|`: `0b11010~0b11101` |
-| LOWER_UPPER_DIGIT_SPECIAL | `a-zA-Z0~9._$` | every char is written using 6 bits, `a-z`: `0b00000~0b11110`, `A-Z`: `0b11010~0b110011`, `0~9`: `0b110100~0b111101`, `._$`: `0b111110~0b1000000` |
-| UTF-8                     | any chars      | UTF-8 encoding |
+| Algorithm                 | Pattern       | Description |
+|---------------------------|---------------|------------------------------------------------------------------------------------------------------------------------------------------------|
+| LOWER_SPECIAL             | `a-z._$\|`    | every char is written using 5 bits, `a-z`: `0b00000~0b11001`, `._$\|`: `0b11010~0b11101` |
+| LOWER_UPPER_DIGIT_SPECIAL | `a-zA-Z0~9._` | every char is written using 6 bits, `a-z`: `0b00000~0b11001`, `A-Z`: `0b11010~0b110011`, `0~9`: `0b110100~0b111101`, `._`: `0b111110~0b111111` |
+| UTF-8                     | any chars     | UTF-8 encoding |
 
 Encoding flags:
 
-| Encoding Flag             | Pattern | Encoding Algorithm |
-|---------------------------|-----------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|
-| LOWER_SPECIAL             | every char is in `a-z._$\|` | `LOWER_SPECIAL` |
-| REP_FIRST_LOWER_SPECIAL   | every char is in `a-z._$` except first char is upper case | replace first upper case char to lower case, then use `LOWER_SPECIAL` |
-| REP_MUL_LOWER_SPECIAL     | every char is in `a-zA-Z._$` | replace every upper case char by `\|` + `lower case`, then use `LOWER_SPECIAL`, use this encoding if it's smaller than Encoding `3` |
-| LOWER_UPPER_DIGIT_SPECIAL | every char is in `a-zA-Z._$` | use `LOWER_UPPER_DIGIT_SPECIAL` encoding if it's smaller than Encoding `2` |
-| UTF8                      | any utf-8 char | use `UTF-8` encoding |
-| Compression               | any utf-8 char | lossless compression |
-
-Depending on cases, one can choose encoding `flags + data` jointly, uses 3 bits of first byte for flags and other bytes
-for data.
+| Encoding Flag             | Pattern | Encoding Algorithm |
+|---------------------------|----------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| LOWER_SPECIAL             | every char is in `a-z._\|` | `LOWER_SPECIAL` |
+| FIRST_TO_LOWER_SPECIAL    | every char is in `a-z._` except first char is upper case | replace first upper case char to lower case, then use `LOWER_SPECIAL` |
+| ALL_TO_LOWER_SPECIAL      | every char is in `a-zA-Z._` | replace every upper case char by `\|` + `lower case`, then use `LOWER_SPECIAL`, use this encoding if it's smaller than Encoding `LOWER_UPPER_DIGIT_SPECIAL` |
+| LOWER_UPPER_DIGIT_SPECIAL | every char is in `a-zA-Z._` | use `LOWER_UPPER_DIGIT_SPECIAL` encoding if it's smaller than Encoding `FIRST_TO_LOWER_SPECIAL` |
+| UTF8                      | any utf-8 char | use `UTF-8` encoding |
+| Compression               | any utf-8 char | lossless compression |
+
+Notes:
+
+- Depending on cases, one can choose encoding `flags + data` jointly, uses 3 bits of first byte for flags and other
+  bytes
+  for data.
 
 ## Value Format
 

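For readers of the tables in this diff, here is a minimal Python sketch of the `LOWER_SPECIAL` 5-bit mapping and of the size comparison between `ALL_TO_LOWER_SPECIAL` and `LOWER_UPPER_DIGIT_SPECIAL`. The code values (`a-z` as 0 to 25, `._$|` as 26 to 29) and the `|` + lower case escape come from the tables. Everything else is an assumption made for illustration: the function names are hypothetical, the bits are packed most-significant-bit first with zero padding in the last byte, and the 3 flag bits mentioned in the notes are left out. It is not taken from the Fury implementation.

```python
# Illustrative sketch only; names and bit-packing order are assumptions.
LOWER_SPECIAL_EXTRA = "._$|"  # codes 26..29, per the LOWER_SPECIAL row


def lower_special_code(ch: str) -> int:
    """Map one char to its 5-bit LOWER_SPECIAL code (0..29)."""
    if "a" <= ch <= "z":
        return ord(ch) - ord("a")
    idx = LOWER_SPECIAL_EXTRA.find(ch)
    if idx < 0:
        raise ValueError(f"{ch!r} is not in the LOWER_SPECIAL pattern")
    return 26 + idx


def encode_lower_special(text: str) -> bytes:
    """Pack 5 bits per char, MSB first, zero-padded to a whole byte.

    The packing order and padding are assumptions; the tables only fix
    the per-char codes. The flag bits are not written here.
    """
    bits = 0
    nbits = 0
    for ch in text:
        bits = (bits << 5) | lower_special_code(ch)
        nbits += 5
    pad = (-nbits) % 8
    return (bits << pad).to_bytes((nbits + pad) // 8, "big")


def all_to_lower_special(text: str) -> str:
    """Replace every upper case char by '|' + its lower case form."""
    return "".join("|" + c.lower() if "A" <= c <= "Z" else c for c in text)


def bits_all_to_lower_special(text: str) -> int:
    # 5 bits per char after escaping upper case chars
    return 5 * len(all_to_lower_special(text))


def bits_lower_upper_digit_special(text: str) -> int:
    # 6 bits per char, no escaping needed
    return 6 * len(text)
```

Under these assumptions, `myField` costs 40 bits as `ALL_TO_LOWER_SPECIAL` (`my|field`) against 42 bits as `LOWER_UPPER_DIGIT_SPECIAL`, while `MyField` costs 45 against 42, which is why the tables tell the encoder to compare sizes before picking a flag.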

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
