This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git
The following commit(s) were added to refs/heads/master by this push:
new 4a89dd9bf [doc] Document remove_orphan_files whole database
4a89dd9bf is described below
commit 4a89dd9bffdba1e31f05c86b7a174f7160871c34
Author: Jingsong <[email protected]>
AuthorDate: Fri Jul 5 22:02:06 2024 +0800
[doc] Document remove_orphan_files whole database
---
docs/content/flink/procedures.md | 5 +++--
docs/content/maintenance/manage-snapshots.md | 16 ++++++++++------
docs/content/spark/procedures.md | 3 ++-
3 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/docs/content/flink/procedures.md b/docs/content/flink/procedures.md
index 3adccb7ab..fc9d48d67 100644
--- a/docs/content/flink/procedures.md
+++ b/docs/content/flink/procedures.md
@@ -179,14 +179,15 @@ All available procedures are listed below.
</td>
<td>
To remove the orphan data files and metadata files. Arguments:
- <li>identifier: the target table identifier. Cannot be empty.</li>
+ <li>identifier: the target table identifier. Cannot be empty, you can use database_name.* to clean whole database.</li>
<li>olderThan: to avoid deleting newly written files, this procedure only
deletes orphan files older than 1 day by default. This argument can modify the interval.
</li>
<li>dryRun: when true, view only orphan files, don't actually remove files. Default is false.</li>
</td>
<td>CALL remove_orphan_files('default.T', '2023-10-31 12:00:00')<br/><br/>
- CALL remove_orphan_files('default.T', '2023-10-31 12:00:00', true)
+ CALL remove_orphan_files('default.*', '2023-10-31 12:00:00')<br/><br/>
+ CALL remove_orphan_files('default.T', '2023-10-31 12:00:00', true)
</td>
</tr>
<tr>
diff --git a/docs/content/maintenance/manage-snapshots.md b/docs/content/maintenance/manage-snapshots.md
index a974b0da9..f7fb94d01 100644
--- a/docs/content/maintenance/manage-snapshots.md
+++ b/docs/content/maintenance/manage-snapshots.md
@@ -296,7 +296,15 @@ submit a `remove_orphan_files` job to clean them:
{{< tabs "remove_orphan_files" >}}
-{{< tab "Flink" >}}
+{{< tab "Spark SQL/Flink SQL" >}}
+```sql
+CALL sys.remove_orphan_files(table => "my_db.my_table", [older_than => "2023-10-31 12:00:00"])
+
+CALL sys.remove_orphan_files(table => "my_db.*", [older_than => "2023-10-31 12:00:00"])
+```
+{{< /tab >}}
+
+{{< tab "Flink Action" >}}
```bash
<FLINK_HOME>/bin/flink run \
@@ -322,12 +330,8 @@ To avoid deleting files that are newly added by other writing jobs, this action
--older_than '2023-10-31 12:00:00'
```
-{{< /tab >}}
+The table can be `*` to clean all tables in the database.
-{{< tab "Spark" >}}
-```sql
-CALL sys.remove_orphan_files(table => "tableId", [older_than => "2023-10-31 12:00:00"])
-```
{{< /tab >}}
{{< /tabs >}}
\ No newline at end of file
diff --git a/docs/content/spark/procedures.md b/docs/content/spark/procedures.md
index fdd41e077..56a8abb61 100644
--- a/docs/content/spark/procedures.md
+++ b/docs/content/spark/procedures.md
@@ -129,12 +129,13 @@ This section introduce all available spark procedures about paimon.
<td>remove_orphan_files</td>
<td>
To remove the orphan data files and metadata files. Arguments:
- <li>table: the target table identifier. Cannot be empty.</li>
+ <li>table: the target table identifier. Cannot be empty, you can use database_name.* to clean whole database.</li>
<li>older_than: to avoid deleting newly written files, this procedure only deletes orphan files older than 1 day by default. This argument can modify the interval.</li>
<li>dry_run: when true, view only orphan files, don't actually remove files. Default is false.</li>
</td>
<td>
CALL sys.remove_orphan_files(table => 'default.T', older_than => '2023-10-31 12:00:00')<br/><br/>
+ CALL sys.remove_orphan_files(table => 'default.*', older_than => '2023-10-31 12:00:00')<br/><br/>
CALL sys.remove_orphan_files(table => 'default.T', older_than => '2023-10-31 12:00:00', dry_run => true)
</td>
</tr>