This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new d591b08aa09 [DOCS] Add explanation for deduplicate command (#10498)
d591b08aa09 is described below

commit d591b08aa09512089ba16e640eccdc771e180fc1
Author: Santhosh Kumar M <[email protected]>
AuthorDate: Fri Mar 1 13:18:52 2024 +0530

    [DOCS] Add explanation for deduplicate command (#10498)
    
    Co-authored-by: Y Ethan Guo <[email protected]>
---
 website/docs/procedures.md                          | 6 +++---
 website/versioned_docs/version-0.13.1/procedures.md | 6 +++---
 website/versioned_docs/version-0.14.0/procedures.md | 6 +++---
 website/versioned_docs/version-0.14.1/procedures.md | 6 +++---
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/website/docs/procedures.md b/website/docs/procedures.md
index 10c3ee853ec..49e8de775a5 100644
--- a/website/docs/procedures.md
+++ b/website/docs/procedures.md
@@ -1753,7 +1753,7 @@ call repair_corrupted_clean_files(table => 'test_hudi_table');
 
 ### repair_deduplicate
 
-Repair deduplicate records for a hudi table.
+Repair deduplicate records for a hudi table. The job deduplicates the data in the duplicated_partition_path and writes it into repaired_output_path. At the end of the job, the data in repaired_output_path is copied into the original path (duplicated_partition_path).
 
 **Input**
 
@@ -1774,12 +1774,12 @@ Repair deduplicate records for a hudi table.
 **Example**
 
 ```
-call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => 'dt=2021-05-04');
+call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => '/tmp/repair_path/');
 ```
 
 | result                                       | 
 |----------------------------------------------|
-| Reduplicated files placed in: dt=2021-05-04. | 
+| Reduplicated files placed in: /tmp/repair_path/. | 
 
 ### repair_migrate_partition_meta
 
diff --git a/website/versioned_docs/version-0.13.1/procedures.md b/website/versioned_docs/version-0.13.1/procedures.md
index 1144efc8d52..dbaf9f36acd 100644
--- a/website/versioned_docs/version-0.13.1/procedures.md
+++ b/website/versioned_docs/version-0.13.1/procedures.md
@@ -1645,7 +1645,7 @@ call repair_corrupted_clean_files(table => 'test_hudi_table');
 
 ### repair_deduplicate
 
-Repair deduplicate records for a hudi table.
+Repair deduplicate records for a hudi table. The job deduplicates the data in the duplicated_partition_path and writes it into repaired_output_path. At the end of the job, the data in repaired_output_path is copied into the original path (duplicated_partition_path).
 
 **Input**
 
@@ -1666,12 +1666,12 @@ Repair deduplicate records for a hudi table.
 **Example**
 
 ```
-call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => 'dt=2021-05-04');
+call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => '/tmp/repair_path/');
 ```
 
 | result                                       | 
 |----------------------------------------------|
-| Reduplicated files placed in: dt=2021-05-04. | 
+| Reduplicated files placed in: /tmp/repair_path/. | 
 
 ### repair_migrate_partition_meta
 
diff --git a/website/versioned_docs/version-0.14.0/procedures.md b/website/versioned_docs/version-0.14.0/procedures.md
index 21d0ab901e1..ec8dea5ae56 100644
--- a/website/versioned_docs/version-0.14.0/procedures.md
+++ b/website/versioned_docs/version-0.14.0/procedures.md
@@ -1693,7 +1693,7 @@ call repair_corrupted_clean_files(table => 'test_hudi_table');
 
 ### repair_deduplicate
 
-Repair deduplicate records for a hudi table.
+Repair deduplicate records for a hudi table. The job deduplicates the data in the duplicated_partition_path and writes it into repaired_output_path. At the end of the job, the data in repaired_output_path is copied into the original path (duplicated_partition_path).
 
 **Input**
 
@@ -1714,12 +1714,12 @@ Repair deduplicate records for a hudi table.
 **Example**
 
 ```
-call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => 'dt=2021-05-04');
+call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => '/tmp/repair_path/');
 ```
 
 | result                                       | 
 |----------------------------------------------|
-| Reduplicated files placed in: dt=2021-05-04. | 
+| Reduplicated files placed in: /tmp/repair_path/. | 
 
 ### repair_migrate_partition_meta
 
diff --git a/website/versioned_docs/version-0.14.1/procedures.md b/website/versioned_docs/version-0.14.1/procedures.md
index 80bbb23a5b5..c913db17de2 100644
--- a/website/versioned_docs/version-0.14.1/procedures.md
+++ b/website/versioned_docs/version-0.14.1/procedures.md
@@ -1753,7 +1753,7 @@ call repair_corrupted_clean_files(table => 'test_hudi_table');
 
 ### repair_deduplicate
 
-Repair deduplicate records for a hudi table.
+Repair deduplicate records for a hudi table. The job deduplicates the data in the duplicated_partition_path and writes it into repaired_output_path. At the end of the job, the data in repaired_output_path is copied into the original path (duplicated_partition_path).
 
 **Input**
 
@@ -1774,12 +1774,12 @@ Repair deduplicate records for a hudi table.
 **Example**
 
 ```
-call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => 'dt=2021-05-04');
+call repair_deduplicate(table => 'test_hudi_table', duplicated_partition_path => 'dt=2021-05-03', repaired_output_path => '/tmp/repair_path/');
 ```
 
 | result                                       | 
 |----------------------------------------------|
-| Reduplicated files placed in: dt=2021-05-04. | 
+| Reduplicated files placed in: /tmp/repair_path/. | 
 
 ### repair_migrate_partition_meta
 
