rdblue commented on a change in pull request #4301:
URL: https://github.com/apache/iceberg/pull/4301#discussion_r823204918



##########
File path: format/spec.md
##########
@@ -193,10 +193,38 @@ Notes:
 
 For details on how to serialize a schema to JSON, see Appendix C.
 
+#### Default value
+Default values can be assigned to top-level columns or nested fields. Default values are used during schema evolution when adding a new column. The default value is used to read rows belonging to the files that lack the column or nested field prior to the schema evolution.
+
+Currently, when a default value for a column or nested field is set, it is considered an incompatible change to change it to another value. However, changing default values is allowed when calling the `allowIncompatibleChanges` API explicitly. Changing default values is discouraged since the occurrence of a rewrite may determine whether the new or old default value is returned during the read.
+
+Default values are encoded in JSON format. The representation depends on the type of the corresponding field. The mapping of types and their corresponding default value JSON representation is described in the following table.
+
+| type               | json type     | example                                | note |
+|--------------------|---------------|----------------------------------------|------|
+| **`boolean`**      | **`boolean`** | true                                   |      |
+| **`int`**          | **`number`**  | 1                                      |      |
+| **`long`**         | **`number`**  | 1                                      |      |
+| **`float`**        | **`number`**  | 1.1                                    |      |
+| **`double`**       | **`number`**  | 1.1                                    |      |
+| **`decimal(P,S)`** | **`string`**  | "0x3162"                               | we use hexadecimal byte literals to encode bytes, with prefix `0x`, the byte array contain the two's-complement representation of the `unscaled` integer value in big-endian byte order, the actual value will be `unscaled * 10 ^ (-scale)` |
+| **`date`**         | **`number`**  | 19054                                  |      |
+| **`time`**         | **`number`**  | 36000000000                            |      |
+| **`timestamp`**    | **`number`**  | 1646277378000000                       |      |
+| **`timestamptz`**  | **`number`**  | 1646277378000000                       |      |
+| **`string`**       | **`string`**  | "foo"                                  |      |
+| **`uuid`**         | **`string`**  | "eb26bdb1-a1d8-4aa6-990e-da940875492c" |      |
+| **`fixed(L)`**     | **`string`**  | "0x3162"                               | we use hexadecimal byte literals to encode bytes, with prefix `0x` |
+| **`binary`**       | **`string`**  | "0x3162"                               | we use hexadecimal byte literals to encode bytes, with prefix `0x` |
+| **`struct`**       | **`object`**  | {"a": 1, "foo": "bar"}                 | the keys are the nested fields' names in the struct schema, and the values should be corresponding value literals of the field's type |
+| **`list`**         | **`array`**   | \[1, 2, 3\]                            | each value should be value literals of the corresponding element type of the list |
+| **`map`**          | **`object`**  | {"a": 1, "b": 2}                       | the key and value type correspond with the schema of the map, the key (json string) are parsed into the resulting key type of the map schema, an exception will be thrown if parsing fails. |

Review comment:
       This encoding doesn't work for non-string keys. I recommend using an 
array of arrays that is interpreted as a list of key/value pairs.
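
A minimal Python sketch of the difference, for illustration only. It is not Iceberg code, and the helper names `encode_map_default` and `decode_map_default` are hypothetical. It assumes map keys are JSON scalars; it shows that a JSON object turns integer keys into strings, while an array of `[key, value]` pairs keeps the key's JSON type.

```python
import json


def encode_map_default(pairs):
    """Encode a map default as a JSON array of [key, value] pairs (hypothetical helper)."""
    return json.dumps([[key, value] for key, value in pairs])


def decode_map_default(text):
    """Decode a JSON array of [key, value] pairs into a dict, rejecting malformed input."""
    result = {}
    for pair in json.loads(text):
        if not isinstance(pair, list) or len(pair) != 2:
            raise ValueError(f"expected a [key, value] pair, got {pair!r}")
        key, value = pair
        if key in result:
            raise ValueError(f"duplicate map key: {key!r}")
        result[key] = value
    return result


# JSON object encoding: member names must be strings, so the int keys are
# coerced and a reader would have to parse them back into the key type.
as_object = json.loads(json.dumps({1: "a", 2: "b"}))

# Array-of-pairs encoding: the keys keep their JSON number type.
encoded = encode_map_default([(1, "a"), (2, "b")])
as_pairs = decode_map_default(encoded)

print(as_object)  # keys are the strings "1" and "2"
print(encoded)    # [[1, "a"], [2, "b"]]
print(as_pairs)   # keys are the ints 1 and 2
```

With this shape the reader converts each key using the map's key type exactly as it would any other value literal, so no string-to-key parsing step is needed.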




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
