rdblue commented on a change in pull request #3482:
URL: https://github.com/apache/iceberg/pull/3482#discussion_r744894279
##########
File path: site/docs/spark-queries.md
##########
@@ -191,57 +187,140 @@ join prod.db.table.snapshots s
on h.snapshot_id = s.snapshot_id
order by made_current_at
```
-```text
-+-------------------------+-----------+----------------+---------------------+----------------------------------+
-| made_current_at | operation | snapshot_id | is_current_ancestor | summary[spark.app.id] |
-+-------------------------+-----------+----------------+---------------------+----------------------------------+
+| made_current_at | operation | snapshot_id | is_current_ancestor | summary[spark.app.id] |
+| -- | -- | -- | -- | -- |
| 2019-02-08 03:29:51.215 | append | 57897183625154 | true | application_1520379288616_155055 |
| 2019-02-09 16:24:30.13 | delete | 29641004024753 | false | application_1520379288616_151109 |
| 2019-02-09 16:32:47.336 | append | 57897183625154 | true | application_1520379288616_155055 |
| 2019-02-08 03:47:55.948 | overwrite | 51792995261850 | true | application_1520379288616_152431 |
-+-------------------------+-----------+----------------+---------------------+----------------------------------+
-```
+
### Files
-To show a table's data files and each file's metadata, run:
+To show a table's current data files and each file's metadata, run:
```sql
SELECT * FROM prod.db.table.files
```
-```text
-+-------------------------------------------------------------------------+-------------+--------------+--------------------+--------------------+------------------+-------------------+------------------+-----------------+-----------------+--------------+---------------+
-| file_path | file_format | record_count | file_size_in_bytes | column_sizes | value_counts | null_value_counts | nan_value_counts | lower_bounds | upper_bounds | key_metadata | split_offsets |
-+-------------------------------------------------------------------------+-------------+--------------+--------------------+--------------------+------------------+-------------------+------------------+-----------------+-----------------+--------------+---------------+
+| file_path | file_format | record_count | file_size_in_bytes | column_sizes | value_counts | null_value_counts | nan_value_counts | lower_bounds | upper_bounds | key_metadata | split_offsets |
+| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| s3:/.../table/data/00000-3-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet | PARQUET | 1 | 597 | [1 -> 90, 2 -> 62] | [1 -> 1, 2 -> 1] | [1 -> 0, 2 -> 0] | [] | [1 -> , 2 -> c] | [1 -> , 2 -> c] | null | [4] |
| s3:/.../table/data/00001-4-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet | PARQUET | 1 | 597 | [1 -> 90, 2 -> 62] | [1 -> 1, 2 -> 1] | [1 -> 0, 2 -> 0] | [] | [1 -> , 2 -> b] | [1 -> , 2 -> b] | null | [4] |
| s3:/.../table/data/00002-5-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet | PARQUET | 1 | 597 | [1 -> 90, 2 -> 62] | [1 -> 1, 2 -> 1] | [1 -> 0, 2 -> 0] | [] | [1 -> , 2 -> a] | [1 -> , 2 -> a] | null | [4] |
-+-------------------------------------------------------------------------+-------------+--------------+--------------------+--------------------+------------------+-------------------+------------------+-----------------+-----------------+--------------+---------------+
-```
### Manifests
-To show a table's file manifests and each file's metadata, run:
+To show a table's current file manifests and each file's metadata, run:
```sql
SELECT * FROM prod.db.table.manifests
```
-```text
-+----------------------------------------------------------------------+--------+-------------------+---------------------+------------------------+---------------------------+--------------------------+--------------------------------------+
-| path | length | partition_spec_id | added_snapshot_id | added_data_files_count | existing_data_files_count | deleted_data_files_count | partition_summaries |
-+----------------------------------------------------------------------+--------+-------------------+---------------------+------------------------+---------------------------+--------------------------+--------------------------------------+
+| path | length | partition_spec_id | added_snapshot_id | added_data_files_count | existing_data_files_count | deleted_data_files_count | partition_summaries |
+| -- | -- | -- | -- | -- | -- | -- | -- |
| s3://.../table/metadata/45b5290b-ee61-4788-b324-b1e2735c0e10-m0.avro | 4479 | 0 | 6668963634911763636 | 8 | 0 | 0 | [[false,null,2019-05-13,2019-05-15]] |
-+----------------------------------------------------------------------+--------+-------------------+---------------------+------------------------+---------------------------+--------------------------+--------------------------------------+
+
+!!! Note
+    1. Fields within the `partition_summaries` column of the manifests table correspond to `field_summary` structs in the [manifest list](./spec.md#manifest-lists), in the following order:
+        - `contains_null`
+        - `contains_nan`
+        - `lower_bound`
+        - `upper_bound`
+    2. `contains_nan` may be null, which indicates that this information is not available from the files' metadata.
+       This usually occurs when reading from a V1 table, where `contains_nan` is not populated.
+
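As an illustration of the field order described in the note, the sketch below flattens `partition_summaries` into one row per partition field. It assumes the column is exposed to Spark as an array of structs whose fields are named as listed in the note; the aliases `m`, `ps`, and `summary` are hypothetical names chosen for this example.

```sql
-- Sketch: one output row per (manifest, partition field summary).
-- Assumes partition_summaries is an array of structs with the
-- field names given in the note above.
SELECT
  m.path,
  summary.contains_null,
  summary.contains_nan,   -- may be null, e.g. when reading a V1 table
  summary.lower_bound,
  summary.upper_bound
FROM prod.db.table.manifests m
LATERAL VIEW explode(m.partition_summaries) ps AS summary
```

A null in the `contains_nan` column of this result means the information is unavailable, not that the partition field has no NaN values.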
+### Partitions
+
+To show a table's current partitions, run:
+
+```sql
+SELECT * FROM prod.db.table.partitions
+```
+
+| partition | record_count | file_count |
+| -- | -- | -- |
+| {20211001, 11}| 1| 1|
+| {20211002, 11}| 1| 1|
+| {20211001, 10}| 1| 1|
+| {20211002, 10}| 1| 1|
+
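Because the partitions table is queried like any other table, it can be sorted or filtered. A minimal sketch, using only the columns shown in the sample output above, that lists the partitions holding the most files first:

```sql
-- Sketch: rank current partitions by how many files they contain.
SELECT *
FROM prod.db.table.partitions
ORDER BY file_count DESC
```

This kind of query may help spot partitions that have accumulated many small files.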
+### Entries
+
+To show a table's current manifest entries as rows, for both delete and data files, run:
+
+```sql
+SELECT * FROM prod.db.table.entries
+```
+
+| status|snapshot_id | sequence_number | data_file |
+| -- | -- | -- | -- |
+|0 |7462238160765527919|0 |{0,
s3://.../dt=20211001/age=10/00000-0-38e4b886-b445-40a9-8db0-58653a331aba-00001.parquet,
PARQUET, {20211001, 10}, 1, 1132, {1 -> 47, 2 -> 52, 3 -> 47, 4 -> 55}, {1 ->
1, 2 -> 1, 3 -> 1, 4 -> 1}, {1 -> 0, 2 -> 0, 3 -> 0, 4 -> 0}, {}, {1 ->