kbendick commented on a change in pull request #3284: URL: https://github.com/apache/iceberg/pull/3284#discussion_r727450562
########## File path: site/docs/spec.md ########## @@ -379,7 +379,7 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _required_ | _required_ | **`100 file_path`** | `string` | Full URI for the file with FS scheme | | _required_ | _required_ | **`101 file_format`** | `string` | String file format name, avro, orc or parquet | | _required_ | _required_ | **`102 partition`** | `struct<...>` | Partition data tuple, schema based on the partition spec output using partition field ids for the struct field ids | -| _required_ | _required_ | **`103 record_count`** | `long` | Number of records in this file | +| _required_ | _required_ | **`103 record_count`** | `long` with special value: `-1: Record count unknown` | Number of records in this file. | Review comment: I did that initially, but there are a few places that place it under description. However, those are arguably placed under `Type` and not `Description` because they're enums, whereas this is just one possible magic value (so I personally agree that `description` is the better column for this). That's mostly why I made this a draft actually 😅 Here's an example of the values being in `Type`: https://github.com/apache/iceberg/blob/86350db9507d6b0d19643beae25de0a758b6d3cb/site/docs/spec.md#L488 As well as here: https://github.com/apache/iceberg/blob/86350db9507d6b0d19643beae25de0a758b6d3cb/site/docs/spec.md#L374-L378 ########## File path: site/docs/spec.md ########## @@ -379,7 +379,7 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _required_ | _required_ | **`100 file_path`** | `string` | Full URI for the file with FS scheme | | _required_ | _required_ | **`101 file_format`** | `string` | String file format name, avro, orc or parquet | | _required_ | _required_ | **`102 partition`** | `struct<...>` | Partition data tuple, schema based on the partition spec output using partition field ids for the struct field ids | -| _required_ | _required_ | **`103 record_count`** | `long` | Number of records in this file | +| _required_ | _required_ | **`103 record_count`** | `long` with special value: `-1: Record count unknown` | Number of records in this file. | Review comment: But since this is just a caveat on one magic value, I do tend to agree with you. I'm open to either. Given it's the spec, I figured I'd let others weigh in. ########## File path: site/docs/spec.md ########## @@ -379,7 +379,7 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _required_ | _required_ | **`100 file_path`** | `string` | Full URI for the file with FS scheme | | _required_ | _required_ | **`101 file_format`** | `string` | String file format name, avro, orc or parquet | | _required_ | _required_ | **`102 partition`** | `struct<...>` | Partition data tuple, schema based on the partition spec output using partition field ids for the struct field ids | -| _required_ | _required_ | **`103 record_count`** | `long` | Number of records in this file | +| _required_ | _required_ | **`103 record_count`** | `long` with special value: `-1: Record count unknown` | Number of records in this file. | Review comment: I updated it to be in `Description` @szehon-ho. I copied language similar to an entry a few lines down. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org