rdblue commented on a change in pull request #4301: URL: https://github.com/apache/iceberg/pull/4301#discussion_r823199766
##########
File path: format/spec.md
##########
@@ -193,10 +193,38 @@ Notes:
For details on how to serialize a schema to JSON, see Appendix C.
+#### Default value
+Default values can be assigned to top-level columns or nested fields. Default values are used during schema evolution when adding a new column. The default value is used to read rows belonging to the files that lack the column or nested field prior to the schema evolution.
+
+Currently, when a default value for a column or nested field is set, it is considered an incompatible change to change it to another value. However, changing default values is allowed when calling the `allowIncompatibleChanges` API explicitly. Changing default values is discouraged since the occurrence of a rewrite may determine whether the new or old default value is returned during the read.
+
+Default values are encoded in JSON format. The representation depends on the type of the corresponding field. The mapping of types and their corresponding default value JSON representation is described in the following table.
+
+| type               | json type     | example                                | note |
+|--------------------|---------------|----------------------------------------|------|
+| **`boolean`**      | **`boolean`** | true                                   |      |
+| **`int`**          | **`number`**  | 1                                      |      |
+| **`long`**         | **`number`**  | 1                                      |      |
+| **`float`**        | **`number`**  | 1.1                                    |      |
+| **`double`**       | **`number`**  | 1.1                                    |      |
+| **`decimal(P,S)`** | **`string`**  | "0x3162"                               | we use hexadecimal byte literals to encode bytes, with prefix `0x`, the byte array contain the two's-complement representation of the `unscaled` integer value in big-endian byte order, the actual value will be `unscaled * 10 ^ (-scale)` |
+| **`date`**         | **`number`**  | 19054                                  |      |
+| **`time`**         | **`number`**  | 36000000000                            |      |
+| **`timestamp`**    | **`number`**  | 1646277378000000                       |      |
+| **`timestamptz`**  | **`number`**  | 1646277378000000                       |      |
+| **`string`**       | **`string`**  | "foo"                                  |      |
+| **`uuid`**         | **`string`**  | "eb26bdb1-a1d8-4aa6-990e-da940875492c" |      |
+| **`fixed(L)`**     | **`string`**  | "0x3162"                               | we use hexadecimal byte literals to encode bytes, with prefix `0x` |
Review comment:
All text, including notes or column headings, should use sentence case.
That is, the first letter should be capitalized. There is no need for complete
sentences; be as direct and concise as possible while being clear.
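
For reference, the `decimal(P,S)` and `fixed(L)` notes in the table above could be read as the following decoding steps. This is only a sketch of the encoding as the proposed text describes it, not code from the PR or from the Iceberg libraries. The function names `decode_hex_bytes` and `decode_decimal_default` are hypothetical, and the scale of 2 used in the usage note is an assumed value, since the example row does not state one.

```python
from decimal import Decimal


def decode_hex_bytes(literal: str) -> bytes:
    """Decode a hexadecimal byte literal such as "0x3162" into raw bytes.

    Hypothetical helper: the proposed spec text only says the string
    carries a `0x` prefix followed by hexadecimal digits.
    """
    if not literal.startswith("0x"):
        raise ValueError(f"expected a 0x-prefixed hex literal, got {literal!r}")
    return bytes.fromhex(literal[2:])


def decode_decimal_default(literal: str, scale: int) -> Decimal:
    """Decode a decimal default value from its hex literal.

    The bytes are read as a big-endian two's-complement integer (the
    unscaled value), and the result is unscaled * 10 ^ (-scale).
    """
    raw = decode_hex_bytes(literal)
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # scaleb(-scale) shifts the decimal exponent, i.e. multiplies by 10 ^ (-scale).
    return Decimal(unscaled).scaleb(-scale)
```

Under these assumptions, `decode_decimal_default("0x3162", 2)` reads the bytes `0x31 0x62` as the unscaled integer 12642 and yields `Decimal("126.42")`, while `decode_hex_bytes("0x3162")` yields the two bytes used directly as a `fixed(2)` value.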
