harshmotw-db commented on code in PR #47473:
URL: https://github.com/apache/spark/pull/47473#discussion_r1690573509


##########
common/variant/README.md:
##########
@@ -335,27 +335,29 @@ The Decimal type contains a scale, but no precision. The implied precision of a
 | Object       | `2` | A collection of (string-key, variant-value) pairs |
 | Array        | `3` | An ordered sequence of variant values             |
 
-| Primitive Type              | Type ID | Equivalent Parquet Type   | Binary format                                                                                             |
-|-----------------------------|---------|---------------------------|-----------------------------------------------------------------------------------------------------------|
-| null                        | `0`     | any                       | none                                                                                                      |
-| boolean (True)              | `1`     | BOOLEAN                   | none                                                                                                      |
-| boolean (False)             | `2`     | BOOLEAN                   | none                                                                                                      |
-| int8                        | `3`     | INT(8, signed)            | 1 byte                                                                                                    |
-| int16                       | `4`     | INT(16, signed)           | 2 byte little-endian                                                                                      |
-| int32                       | `5`     | INT(32, signed)           | 4 byte little-endian                                                                                      |
-| int64                       | `6`     | INT(64, signed)           | 8 byte little-endian                                                                                      |
-| double                      | `7`     | DOUBLE                    | IEEE little-endian                                                                                        |
-| decimal4                    | `8`     | DECIMAL(precision, scale) | 1 byte scale in range [0, 38], followed by little-endian unscaled value (see decimal table)               |
-| decimal8                    | `9`     | DECIMAL(precision, scale) | 1 byte scale in range [0, 38], followed by little-endian unscaled value (see decimal table)               |
-| decimal16                   | `10`    | DECIMAL(precision, scale) | 1 byte scale in range [0, 38], followed by little-endian unscaled value (see decimal table)               |
-| date                        | `11`    | DATE                      | 4 byte little-endian                                                                                      |
-| timestamp                   | `12`    | TIMESTAMP(true, MICROS)   | 8-byte little-endian                                                                                      |
-| timestamp without time zone | `13`    | TIMESTAMP(false, MICROS)  | 8-byte little-endian                                                                                      |
-| float                       | `14`    | FLOAT                     | IEEE little-endian                                                                                        |
-| binary                      | `15`    | BINARY                    | 4 byte little-endian size, followed by bytes                                                              |
-| string                      | `16`    | STRING                    | 4 byte little-endian size, followed by UTF-8 encoded bytes                                                |
-| binary from metadata        | `17`    | BINARY                    | Little-endian index into the metadata dictionary. Number of bytes is equal to the metadata `offset_size`. |
-| string from metadata        | `18`    | STRING                    | Little-endian index into the metadata dictionary. Number of bytes is equal to the metadata `offset_size`. |
+| Primitive Type              | Type ID | Equivalent Parquet Type                       | Binary format                                                                                                       |
+|-----------------------------|---------|-----------------------------------------------|---------------------------------------------------------------------------------------------------------------------|
+| null                        | `0`     | any                                           | none                                                                                                                |
+| boolean (True)              | `1`     | BOOLEAN                                       | none                                                                                                                |
+| boolean (False)             | `2`     | BOOLEAN                                       | none                                                                                                                |
+| int8                        | `3`     | INT(8, signed)                                | 1 byte                                                                                                              |
+| int16                       | `4`     | INT(16, signed)                               | 2 byte little-endian                                                                                                |
+| int32                       | `5`     | INT(32, signed)                               | 4 byte little-endian                                                                                                |
+| int64                       | `6`     | INT(64, signed)                               | 8 byte little-endian                                                                                                |
+| double                      | `7`     | DOUBLE                                        | IEEE little-endian                                                                                                  |
+| decimal4                    | `8`     | DECIMAL(precision, scale)                     | 1 byte scale in range [0, 38], followed by little-endian unscaled value (see decimal table)                         |
+| decimal8                    | `9`     | DECIMAL(precision, scale)                     | 1 byte scale in range [0, 38], followed by little-endian unscaled value (see decimal table)                         |
+| decimal16                   | `10`    | DECIMAL(precision, scale)                     | 1 byte scale in range [0, 38], followed by little-endian unscaled value (see decimal table)                         |
+| date                        | `11`    | DATE                                          | 4 byte little-endian                                                                                                |
+| timestamp                   | `12`    | TIMESTAMP(true, MICROS)                       | 8-byte little-endian                                                                                                |
+| timestamp without time zone | `13`    | TIMESTAMP(false, MICROS)                      | 8-byte little-endian                                                                                                |
+| float                       | `14`    | FLOAT                                         | IEEE little-endian                                                                                                  |
+| binary                      | `15`    | BINARY                                        | 4 byte little-endian size, followed by bytes                                                                        |
+| string                      | `16`    | STRING                                        | 4 byte little-endian size, followed by UTF-8 encoded bytes                                                          |
+| binary from metadata        | `17`    | BINARY                                        | Little-endian index into the metadata dictionary. Number of bytes is equal to the metadata `offset_size`.           |
+| string from metadata        | `18`    | STRING                                        | Little-endian index into the metadata dictionary. Number of bytes is equal to the metadata `offset_size`.           |
+| year-month interval         | `19`    | YearMonthIntervalType(start_field, end_field) | 1 byte denoting start field (1 bit) and end field (1 bit) starting at LSB followed by 4-byte little-endian value.   |

Review Comment:
   I had mistakenly put the equivalent Spark types in this column earlier. I have removed the Parquet type for now while I investigate which Parquet type applies.
   
   The start and end fields are described in a paragraph that follows this table in this PR.
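
For readers following the thread, here is a minimal sketch of how the binary format in the new year-month interval row might be decoded. It is illustrative only, not the code in this PR. The helper name `decode_year_month_interval` is hypothetical. The field codes (0 = YEAR, 1 = MONTH) are assumed to mirror Spark's `YearMonthIntervalType`, and the 4-byte value is assumed to be a signed integer. The sketch covers only the 5-byte payload described in the row, not the variant value header that carries the type ID.

```python
import struct

# Assumed field codes, mirroring Spark's YearMonthIntervalType (YEAR = 0, MONTH = 1).
FIELD_NAMES = {0: "YEAR", 1: "MONTH"}


def decode_year_month_interval(payload: bytes):
    """Decode the 5-byte payload: 1 byte of field bits, then a 4-byte little-endian value.

    Hypothetical helper for illustration; the signedness of the value is an assumption.
    """
    if len(payload) != 5:
        raise ValueError(f"expected 5 bytes, got {len(payload)}")
    header = payload[0]
    start_field = header & 0x1        # bit 0 (the LSB) holds the start field
    end_field = (header >> 1) & 0x1   # bit 1 holds the end field
    (value,) = struct.unpack_from("<i", payload, 1)  # little-endian int32 after the header byte
    return FIELD_NAMES[start_field], FIELD_NAMES[end_field], value


# Example: start field YEAR (bit 0 = 0), end field MONTH (bit 1 = 1), value 14.
example = bytes([0b10]) + struct.pack("<i", 14)
print(decode_year_month_interval(example))
```

Packing both field bits into one leading byte keeps the payload at a fixed 5 bytes regardless of which fields are set.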



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

