RussellSpitzer commented on a change in pull request #3375:
URL: https://github.com/apache/iceberg/pull/3375#discussion_r737537081



##########
File path: site/docs/spark-procedures.md
##########
@@ -240,6 +240,34 @@ Remove any files in the `tablelocation/data` folder which 
are not known to the t
 CALL catalog_name.system.remove_orphan_files(table => 'db.sample', location => 
'tablelocation/data')
 ```
 
+### `rewrite_data_files`
+
+Iceberg tracks each data file in a table. More data files leads to more metadata stored in manifest files, and small data files causes an unnecessary amount of metadata and less efficient queries from file open costs.
+
+Iceberg can compact data files in parallel using Spark with the `rewriteDataFiles` action. This will combine small files into larger files to reduce metadata overhead and runtime file open cost.
+
+#### Usage
+
+| Argument Name | Required? | Type | Description |
+|---------------|-----------|------|-------------|
+| `table`       | ✔️  | string | Name of the table to update |
+| `strategy`    |    | string | Name of the strategy - binpack or sort |
+| `options`     |    | map<string, string> | Options to be used for actions. Supported options are target-file-size-bytes, partial-progress.enabled, partial-progress.max-commits, max-file-group-size-bytes, max-concurrent-file-group-rewrites, min-input-files, min-file-size-bytes, max-file-size-bytes|
+
+#### Output
+
+| Output Name | Type | Description |
+| ------------|------|-------------|
+| `rewritten_data_files_count` | int | Number of data which were re-written by this command |
+| `rewritten_data_files_count`     | int | Number of new data files which were written by this command |

Review comment:
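       Typo in output name: both rows are named `rewritten_data_files_count`, but the second row describes the new data files written, so it needs its own name.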
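
       A usage example under the table might also help, along the lines of the `remove_orphan_files` one just above. A rough sketch, assuming the argument names from the Usage table and Spark's `map(...)` function for `options`; `catalog_name`, `db.sample`, and the option value are placeholders:

```sql
-- Compact small files in db.sample with the binpack strategy.
-- 'min-input-files' is one of the supported options listed in the table;
-- the value '2' is only illustrative.
CALL catalog_name.system.rewrite_data_files(
  table => 'db.sample',
  strategy => 'binpack',
  options => map('min-input-files', '2')
)
```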




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


