rdblue commented on a change in pull request #3482:
URL: https://github.com/apache/iceberg/pull/3482#discussion_r765345115
##########
File path: site/docs/spark-queries.md
##########
@@ -191,57 +192,166 @@ join prod.db.table.snapshots s
on h.snapshot_id = s.snapshot_id
order by made_current_at
```
-```text
-+-------------------------+-----------+----------------+---------------------+----------------------------------+
-| made_current_at | operation | snapshot_id | is_current_ancestor | summary[spark.app.id] |
-+-------------------------+-----------+----------------+---------------------+----------------------------------+
+
+<div class="markdown-table-container" markdown="block">
+| made_current_at | operation | snapshot_id | is_current_ancestor | summary[spark.app.id] |
+| -- | -- | -- | -- | -- |
| 2019-02-08 03:29:51.215 | append | 57897183625154 | true | application_1520379288616_155055 |
| 2019-02-09 16:24:30.13 | delete | 29641004024753 | false | application_1520379288616_151109 |
| 2019-02-09 16:32:47.336 | append | 57897183625154 | true | application_1520379288616_155055 |
| 2019-02-08 03:47:55.948 | overwrite | 51792995261850 | true | application_1520379288616_152431 |
-+-------------------------+-----------+----------------+---------------------+----------------------------------+
-```
+</div>
### Files
-To show a table's data files and each file's metadata, run:
+To show a table's current data files and each file's metadata, run:
```sql
SELECT * FROM prod.db.table.files
```
-```text
-+-------------------------------------------------------------------------+-------------+--------------+--------------------+--------------------+------------------+-------------------+------------------+-----------------+-----------------+--------------+---------------+
-| file_path | file_format | record_count | file_size_in_bytes | column_sizes | value_counts | null_value_counts | nan_value_counts | lower_bounds | upper_bounds | key_metadata | split_offsets |
-+-------------------------------------------------------------------------+-------------+--------------+--------------------+--------------------+------------------+-------------------+------------------+-----------------+-----------------+--------------+---------------+
+
+<div class="markdown-table-container" markdown="block">
+| file_path | file_format | record_count | file_size_in_bytes | column_sizes | value_counts | null_value_counts | nan_value_counts | lower_bounds | upper_bounds | key_metadata | split_offsets |
+| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| s3:/.../table/data/00000-3-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet | PARQUET | 1 | 597 | [1 -> 90, 2 -> 62] | [1 -> 1, 2 -> 1] | [1 -> 0, 2 -> 0] | [] | [1 -> , 2 -> c] | [1 -> , 2 -> c] | null | [4] |
| s3:/.../table/data/00001-4-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet | PARQUET | 1 | 597 | [1 -> 90, 2 -> 62] | [1 -> 1, 2 -> 1] | [1 -> 0, 2 -> 0] | [] | [1 -> , 2 -> b] | [1 -> , 2 -> b] | null | [4] |
| s3:/.../table/data/00002-5-8d6d60e8-d427-4809-bcf0-f5d45a4aad96.parquet | PARQUET | 1 | 597 | [1 -> 90, 2 -> 62] | [1 -> 1, 2 -> 1] | [1 -> 0, 2 -> 0] | [] | [1 -> , 2 -> a] | [1 -> , 2 -> a] | null | [4] |
-+-------------------------------------------------------------------------+-------------+--------------+--------------------+--------------------+------------------+-------------------+------------------+-----------------+-----------------+--------------+---------------+
-```
+</div>
### Manifests
-To show a table's file manifests and each file's metadata, run:
+To show a table's current file manifests and each file's metadata, run:
```sql
SELECT * FROM prod.db.table.manifests
```
-```text
-+----------------------------------------------------------------------+--------+-------------------+---------------------+------------------------+---------------------------+--------------------------+--------------------------------------+
-| path | length | partition_spec_id | added_snapshot_id | added_data_files_count | existing_data_files_count | deleted_data_files_count | partition_summaries |
-+----------------------------------------------------------------------+--------+-------------------+---------------------+------------------------+---------------------------+--------------------------+--------------------------------------+
+
+<div class="markdown-table-container" markdown="block">
+| path | length | partition_spec_id | added_snapshot_id | added_data_files_count | existing_data_files_count | deleted_data_files_count | partition_summaries |
+| -- | -- | -- | -- | -- | -- | -- | -- |
| s3://.../table/metadata/45b5290b-ee61-4788-b324-b1e2735c0e10-m0.avro | 4479 | 0 | 6668963634911763636 | 8 | 0 | 0 | [[false,null,2019-05-13,2019-05-15]] |
-+----------------------------------------------------------------------+--------+-------------------+---------------------+------------------------+---------------------------+--------------------------+--------------------------------------+
+</div>
+
+!!! Note
+ 1. Fields within `partition_summaries` column of the manifests table correspond to `field_summary` structs within [manifest list](./spec.md#manifest-lists), with the following order:
+ - `contains_null`
+ - `contains_nan`
+ - `lower_bound`
+ - `upper_bound`
+ 2. `contains_nan` could return null, which indicates that this information is not available from files' metadata.
+ This usually occurs when reading from V1 table, where `contains_nan` is not populated.
+
+### Partitions
+
+To show a table's current partitions
+
+```sql
+SELECT * FROM prod.db.table.partitions
```
-Note:
-1. Fields within `partition_summaries` column of the manifests table correspond to `field_summary` structs within [manifest list](./spec.md#manifest-lists), with the following order:
- - `contains_null`
- - `contains_nan`
- - `lower_bound`
- - `upper_bound`
-2. `contains_nan` could return null, which indicates that this information is not available from files' metadata.
- This usually occurs when reading from V1 table, where `contains_nan` is not populated.
+<div class="markdown-table-container" markdown="block">
+| partition | record_count | file_count |
+| -- | -- | -- |
+| {20211001, 11}| 1| 1|
+| {20211002, 11}| 1| 1|
+| {20211001, 10}| 1| 1|
+| {20211002, 10}| 1| 1|
+</div>
+
+### All Metadata
+
+!!! Note
+ The metadata is readable from any snapshot currently tracked by the table
+
+!!! WARNING
+ The table's metadata may return **duplicate** rows
Review comment:
This sentence is misleading. It should say something like `The "all" metadata tables may produce more than one row per data file or manifest file because metadata files may be used in more than one table snapshot`.
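
To illustrate the suggested wording, here is a rough sketch of how a reader might see and handle the repeated rows. It assumes the "all" data files table is exposed as `all_data_files`; that name does not appear in this hunk, so treat it as an assumption and adjust it to whatever the docs end up using.

```sql
-- Assumption: `all_data_files` is the "all" metadata table for data files.
-- Data files that appear more than once, because several snapshots reference them.
SELECT file_path, count(*) AS occurrences
FROM prod.db.table.all_data_files
GROUP BY file_path
HAVING count(*) > 1;

-- One row per data file, no matter how many snapshots reference it.
SELECT DISTINCT file_path
FROM prod.db.table.all_data_files;
```

The first query shows that the repeats come from files shared across snapshots rather than from corrupt metadata. The second is the usual way to collapse them when only the set of files matters.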
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]