gene-db commented on code in PR #475:
URL: https://github.com/apache/parquet-format/pull/475#discussion_r1870179241


##########
VariantEncoding.md:
##########
@@ -88,9 +88,9 @@ metadata  |        header         |
           +-----------------------+
 ```
 
-The metadata is encoded first with the `header` byte, then `dictionary_size` 
which is a little-endian value of `offset_size` bytes, and represents the 
number of string values in the dictionary.
+The metadata is encoded first with the `header` byte, then `dictionary_size` 
which is a unsigned little-endian value of `offset_size` bytes, and represents 
the number of string values in the dictionary.
 Next, is an `offset` list, which contains `dictionary_size + 1` values.
-Each `offset` is a little-endian value of `offset_size` bytes, and represents 
the starting byte offset of the i-th string in `bytes`.
+Each `offset` is a usigned little-endian value of `offset_size` bytes, and 
represents the starting byte offset of the i-th string in `bytes`.

Review Comment:
   ```suggestion
   Each `offset` is an unsigned little-endian value of `offset_size` bytes, and represents the starting byte offset of the i-th string in `bytes`.
   ```
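
For illustration, here is a minimal Python sketch of how a reader might decode the dictionary described in this hunk, and why "unsigned" matters. The names `read_unsigned_le` and `read_dictionary` are hypothetical, `offset_size` is passed in rather than derived from the `header` byte, and the sketch assumes that string i ends where string i+1 begins and that the strings are UTF-8. None of those details are stated in this hunk.

```python
from typing import List


def read_unsigned_le(buf: bytes, pos: int, size: int) -> int:
    """Read `size` bytes at `pos` as an unsigned little-endian integer."""
    chunk = buf[pos:pos + size]
    if len(chunk) != size:
        raise ValueError("buffer too short")
    # signed=False is the point of the wording fix: 0xFF must decode to 255, not -1.
    return int.from_bytes(chunk, byteorder="little", signed=False)


def read_dictionary(metadata: bytes, offset_size: int) -> List[str]:
    """Hypothetical helper: decode the metadata string dictionary.

    Assumes `offset_size` is already known (here it is a parameter).
    """
    pos = 1  # skip the 1-byte header
    dictionary_size = read_unsigned_le(metadata, pos, offset_size)
    pos += offset_size

    # The offset list holds dictionary_size + 1 entries.
    offsets = []
    for _ in range(dictionary_size + 1):
        offsets.append(read_unsigned_le(metadata, pos, offset_size))
        pos += offset_size

    # Assumption: everything after the offset list is `bytes`, and
    # string i spans offsets[i] .. offsets[i + 1], encoded as UTF-8.
    string_bytes = metadata[pos:]
    return [
        string_bytes[offsets[i]:offsets[i + 1]].decode("utf-8")
        for i in range(dictionary_size)
    ]


# Example buffer: arbitrary header byte, dictionary_size=2, offsets 0,1,3, bytes "abc".
example = bytes([0x01, 2, 0, 1, 3]) + b"abc"
strings = read_dictionary(example, offset_size=1)
```

With a signed interpretation, a 1-byte `dictionary_size` above 127 would decode as negative, which is the ambiguity the wording change removes.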



##########
VariantEncoding.md:
##########
@@ -69,17 +69,17 @@ The entire metadata is encoded as the following diagram 
shows:
 metadata  |        header         |
           +-----------------------+
           |                       |
-          :    dictionary_size    :  <-- little-endian, `offset_size` bytes
+          :    dictionary_size    :  <-- unsigned little-endian, `offset_size` 
bytes
           |                       |
           +-----------------------+
           |                       |
-          :        offset         :  <-- little-endian, `offset_size` bytes
+          :        offset         :  <--  unsigned little-endian, 
`offset_size` bytes

Review Comment:
   NIT:
   ```suggestion
             :        offset         :  <-- unsigned little-endian, `offset_size` bytes
   ```



##########
VariantEncoding.md:
##########
@@ -88,9 +88,9 @@ metadata  |        header         |
           +-----------------------+
 ```
 
-The metadata is encoded first with the `header` byte, then `dictionary_size` 
which is a little-endian value of `offset_size` bytes, and represents the 
number of string values in the dictionary.
+The metadata is encoded first with the `header` byte, then `dictionary_size` 
which is a unsigned little-endian value of `offset_size` bytes, and represents 
the number of string values in the dictionary.

Review Comment:
   ```suggestion
   The metadata is encoded first with the `header` byte, then `dictionary_size` which is an unsigned little-endian value of `offset_size` bytes, and represents the number of string values in the dictionary.
   ```



##########
VariantEncoding.md:
##########
@@ -69,17 +69,17 @@ The entire metadata is encoded as the following diagram 
shows:
 metadata  |        header         |
           +-----------------------+
           |                       |
-          :    dictionary_size    :  <-- little-endian, `offset_size` bytes
+          :    dictionary_size    :  <-- unsigned little-endian, `offset_size` 
bytes
           |                       |
           +-----------------------+
           |                       |
-          :        offset         :  <-- little-endian, `offset_size` bytes
+          :        offset         :  <--  unsigned little-endian, 
`offset_size` bytes
           |                       |
           +-----------------------+
                       :
           +-----------------------+
           |                       |
-          :        offset         :  <-- little-endian, `offset_size` bytes
+          :        offset         :  <--  unsigned little-endian, 
`offset_size` bytes

Review Comment:
   NIT:
   ```suggestion
             :        offset         :  <-- unsigned little-endian, `offset_size` bytes
   ```



##########
VariantEncoding.md:
##########
@@ -313,10 +313,10 @@ array value_data  |                       |
                   |                       |
                   +-----------------------+
 ```
-An array `value_data` begins with `num_elements`, a 1-byte or 4-byte 
little-endian value, representing the number of elements in the array.
+An array `value_data` begins with `num_elements`, a 1-byte or 4-byte unsigned 
little-endian value, representing the number of elements in the array.
 The size in bytes of `num_elements` is indicated by `is_large` in the 
`value_header`.
 Next, is a `field_offset` list.
-There are `num_elements + 1` number of entries and each `field_offset` is a 
little-endian value of `field_offset_size` bytes.
+There are `num_elements + 1` number of entries and each `field_offset` is a 
unsigned little-endian value of `field_offset_size` bytes.

Review Comment:
   ```suggestion
   There are `num_elements + 1` number of entries and each `field_offset` is an unsigned little-endian value of `field_offset_size` bytes.
   ```
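
As a rough sketch of the array layout this paragraph describes, the following Python reads `num_elements` and the `field_offset` list. The function name `read_array_offsets` is hypothetical, `is_large` and `field_offset_size` are taken as parameters rather than parsed from `value_header`, and the mapping "`is_large` set means 4 bytes" is an assumption, since the hunk only says the size is indicated by `is_large`.

```python
from typing import List, Tuple


def read_unsigned_le(buf: bytes, pos: int, size: int) -> int:
    """Read `size` bytes at `pos` as an unsigned little-endian integer."""
    chunk = buf[pos:pos + size]
    if len(chunk) != size:
        raise ValueError("buffer too short")
    return int.from_bytes(chunk, byteorder="little", signed=False)


def read_array_offsets(
    value_data: bytes, is_large: bool, field_offset_size: int
) -> Tuple[int, List[int]]:
    """Hypothetical helper: return (num_elements, field_offsets) for an array."""
    # Assumption: is_large selects the 4-byte form, otherwise 1 byte.
    num_size = 4 if is_large else 1
    num_elements = read_unsigned_le(value_data, 0, num_size)

    # The field_offset list has num_elements + 1 entries.
    pos = num_size
    offsets = []
    for _ in range(num_elements + 1):
        offsets.append(read_unsigned_le(value_data, pos, field_offset_size))
        pos += field_offset_size
    return num_elements, offsets


# Small form: num_elements=2, three 1-byte offsets.
small = read_array_offsets(bytes([2, 0, 1, 3]), is_large=False, field_offset_size=1)

# Large form: 4-byte num_elements=2, three 2-byte offsets (0, 1, 256).
large = read_array_offsets(
    bytes([2, 0, 0, 0, 0, 0, 1, 0, 0, 1]), is_large=True, field_offset_size=2
)
```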



##########
VariantEncoding.md:
##########
@@ -254,13 +254,13 @@ object value_data  |                       |
                    |                       |
                    +-----------------------+
 ```
-An object `value_data` begins with `num_elements`, a 1-byte or 4-byte 
little-endian value, representing the number of elements in the object.
+An object `value_data` begins with `num_elements`, a 1-byte or 4-byte unsigned 
little-endian value, representing the number of elements in the object.
 The size in bytes of `num_elements` is indicated by `is_large` in the 
`value_header`.
 Next, is a list of `field_id` values.
-There are `num_elements` number of entries and each `field_id` is a 
little-endian value of `field_id_size` bytes.
+There are `num_elements` number of entries and each `field_id` is a unsigned 
little-endian value of `field_id_size` bytes.
 A `field_id` is an index into the dictionary in the metadata.
 The `field_id` list is followed by a `field_offset` list.
-There are `num_elements + 1` number of entries and each `field_offset` is a 
little-endian value of `field_offset_size` bytes.
+There are `num_elements + 1` number of entries and each `field_offset` is a 
unsigned little-endian value of `field_offset_size` bytes.

Review Comment:
   ```suggestion
   There are `num_elements + 1` number of entries and each `field_offset` is an unsigned little-endian value of `field_offset_size` bytes.
   ```
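
The object layout in this hunk can be sketched the same way: `num_elements`, then the `field_id` list, then the `field_offset` list. The function name `read_object_index` is hypothetical, and the sizes are passed in as parameters instead of being parsed from `value_header`. As with arrays, treating `is_large` as "4 bytes" is an assumption.

```python
from typing import List, Tuple


def read_unsigned_le(buf: bytes, pos: int, size: int) -> int:
    """Read `size` bytes at `pos` as an unsigned little-endian integer."""
    chunk = buf[pos:pos + size]
    if len(chunk) != size:
        raise ValueError("buffer too short")
    return int.from_bytes(chunk, byteorder="little", signed=False)


def read_object_index(
    value_data: bytes, is_large: bool, field_id_size: int, field_offset_size: int
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: return (field_ids, field_offsets) for an object."""
    # Assumption: is_large selects the 4-byte form, otherwise 1 byte.
    num_size = 4 if is_large else 1
    num_elements = read_unsigned_le(value_data, 0, num_size)
    pos = num_size

    # num_elements field_id entries; each is an index into the metadata dictionary.
    field_ids = []
    for _ in range(num_elements):
        field_ids.append(read_unsigned_le(value_data, pos, field_id_size))
        pos += field_id_size

    # num_elements + 1 field_offset entries follow the field_id list.
    field_offsets = []
    for _ in range(num_elements + 1):
        field_offsets.append(read_unsigned_le(value_data, pos, field_offset_size))
        pos += field_offset_size
    return field_ids, field_offsets


# num_elements=2, 2-byte field_ids (5 and 256), 1-byte offsets (0, 1, 3).
ids, offsets = read_object_index(
    bytes([2, 5, 0, 0, 1, 0, 1, 3]),
    is_large=False,
    field_id_size=2,
    field_offset_size=1,
)
```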



##########
VariantEncoding.md:
##########
@@ -254,13 +254,13 @@ object value_data  |                       |
                    |                       |
                    +-----------------------+
 ```
-An object `value_data` begins with `num_elements`, a 1-byte or 4-byte 
little-endian value, representing the number of elements in the object.
+An object `value_data` begins with `num_elements`, a 1-byte or 4-byte unsigned 
little-endian value, representing the number of elements in the object.
 The size in bytes of `num_elements` is indicated by `is_large` in the 
`value_header`.
 Next, is a list of `field_id` values.
-There are `num_elements` number of entries and each `field_id` is a 
little-endian value of `field_id_size` bytes.
+There are `num_elements` number of entries and each `field_id` is a unsigned 
little-endian value of `field_id_size` bytes.

Review Comment:
   ```suggestion
   There are `num_elements` number of entries and each `field_id` is an unsigned little-endian value of `field_id_size` bytes.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

