This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch branch-4.4.1
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 53ee6536ca9b57dac2482f4a68f36c494c025513
Author: Daniel Becker <[email protected]>
AuthorDate: Thu May 2 15:02:28 2024 +0200

    IMPALA-13036: Document Iceberg metadata tables
    
    This change adds documentation on how Iceberg metadata tables can be
    used.
    
    Testing:
     - built docs locally
    
    Change-Id: Ic453f567b814cb4363a155e2008029e94efb6ed1
    Reviewed-on: http://gerrit.cloudera.org:8080/21387
    Tested-by: Impala Public Jenkins <[email protected]>
    Reviewed-by: Peter Rozsa <[email protected]>
---
 docs/topics/impala_iceberg.xml | 72 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/docs/topics/impala_iceberg.xml b/docs/topics/impala_iceberg.xml
index 4cc95503a..0c0ce344a 100644
--- a/docs/topics/impala_iceberg.xml
+++ b/docs/topics/impala_iceberg.xml
@@ -716,6 +716,78 @@ ALTER TABLE ice_tbl EXECUTE expire_snapshots(now() - interval 5 days);
     </conbody>
   </concept>
 
+  <concept id="iceberg_metadata_tables">
+    <title>Iceberg metadata tables</title>
+    <conbody>
+      <p>
+        Iceberg stores extensive metadata for each table (e.g. snapshots, manifests, data
+        and delete files), which is accessible in Impala in the form of virtual
+        tables called metadata tables.
+      </p>
+      <p>
+        Metadata tables can be queried just like regular tables, including filtering,
+        aggregation and joining with other metadata and regular tables. However,
+        they are read-only: their records cannot be changed, added or removed,
+        they cannot be dropped, and new metadata tables cannot be created. Metadata
+        changes made in other ways (not through metadata tables) are reflected in the
+        tables.
+      </p>
+      <p>
+        To list the metadata tables available for an Iceberg table, use the <codeph>SHOW
+        METADATA TABLES</codeph> command:
+
+        <codeblock>
+SHOW METADATA TABLES IN [db.]tbl [[LIKE] "pattern"]
+        </codeblock>
+
+        It is possible to filter the result using <codeph>pattern</codeph>. All Iceberg
+        tables have the same metadata tables, so this command is mostly for convenience.
+        Using <codeph>SHOW METADATA TABLES</codeph> on a non-Iceberg table results in an
+        error.
+      </p>
+      <p>
+        Just like regular tables, metadata tables have schemas that can be queried with
+        the <codeph>DESCRIBE</codeph> command. Note, however, that <codeph>DESCRIBE
+        FORMATTED|EXTENDED</codeph> are not available for metadata tables.
+      </p>
+      <p>
+        Example:
+        <codeblock>
+DESCRIBE functional_parquet.iceberg_alltypes_part.history;
+        </codeblock>
+      </p>
+      <p>
+        To retrieve information from metadata tables, use the usual
+        <codeph>SELECT</codeph> statement. You can select any subset of the columns or all
+        of them using <codeph>*</codeph>. Note that in contrast to regular tables, <codeph>SELECT
+        *</codeph> on metadata tables always includes complex-typed columns in the result.
+        Therefore, the query option <codeph>EXPAND_COMPLEX_TYPES</codeph> only applies to
+        regular tables. This also holds in queries that mix metadata tables and regular
+        tables: for <codeph>SELECT *</codeph> expressions from metadata tables, complex
+        types will always be included, and for <codeph>SELECT *</codeph> expressions from
+        regular tables, complex types will be included if and only if
+        <codeph>EXPAND_COMPLEX_TYPES</codeph> is true.
+      </p>
+      <p>
+        Note that unnesting collections from metadata tables is not supported.
+      </p>
+      <p>
+        Example:
+        <codeblock>
+SELECT
+    s.operation,
+    h.is_current_ancestor,
+    s.summary
+FROM functional_parquet.iceberg_alltypes_part.history h
+JOIN functional_parquet.iceberg_alltypes_part.snapshots s
+  ON h.snapshot_id = s.snapshot_id
+WHERE s.operation = 'append'
+ORDER BY h.made_current_at;
+        </codeblock>
+      </p>
+    </conbody>
+  </concept>
+
   <concept id="iceberg_table_cloning">
     <title>Cloning Iceberg tables (LIKE clause)</title>
     <conbody>
