yihua commented on code in PR #9756:
URL: https://github.com/apache/hudi/pull/9756#discussion_r1332509504
##########
website/docs/basic_configurations.md:
##########
@@ -260,12 +262,13 @@ Configurations that control write behavior on Hudi
tables. These can be directly
[**Basic Configs**](#Write-Configurations-basic-configs)
-| Config Name
| Default | Description
|
-|
---------------------------------------------------------------------------------
| ------------------------ |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
-| [hoodie.base.path](#hoodiebasepath)
| N/A **(Required)** | Base path on lake storage, under which all
the table data is stored. Always prefix it explicitly with the storage scheme
(e.g hdfs://, s3:// etc). Hudi stores all the main meta-data about commits,
savepoints, cleaning audit logs etc in .hoodie directory under this base path
directory.<br /><br />`Config Param: BASE_PATH`
|
-| [hoodie.table.name](#hoodietablename)
| N/A **(Required)** | Table name that will be used for registering
with metastores like HMS. Needs to be same across runs.<br /><br />`Config
Param: TBL_NAME`
|
-|
[hoodie.datasource.write.precombine.field](#hoodiedatasourcewriteprecombinefield)
| ts (Optional) | Field used in preCombining before actual write.
When two records have the same key value, we will pick the one with the largest
value for the precombine field, determined by Object.compareTo(..)<br /><br
/>`Config Param: PRECOMBINE_FIELD_NAME`
|
-| [hoodie.write.concurrency.mode](#hoodiewriteconcurrencymode)
| SINGLE_WRITER (Optional) |
org.apache.hudi.common.model.WriteConcurrencyMode: Concurrency modes for write
operations. SINGLE_WRITER(default): Only one active writer to the table.
Maximizes throughput. OPTIMISTIC_CONCURRENCY_CONTROL: Multiple writers can
operate on the table with lazy conflict resolution using locks. This means that
only one writer succeeds if multiple writers write to the same file group.<br
/><br />`Config Param: WRITE_CONCURRENCY_MODE` |
+| Config Name
| Default | Description
|
+|
-----------------------------------------------------------------------------------------
| ------------------------ |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
+| [hoodie.base.path](#hoodiebasepath)
| N/A **(Required)** | Base path on lake storage, under
which all the table data is stored. Always prefix it explicitly with the
storage scheme (e.g hdfs://, s3:// etc). Hudi stores all the main meta-data
about commits, savepoints, cleaning audit logs etc in .hoodie directory under
this base path directory.<br />`Config Param: BASE_PATH`
|
+| [hoodie.table.name](#hoodietablename)
| N/A **(Required)** | Table name that will be used for
registering with metastores like HMS. Needs to be same across runs.<br
/>`Config Param: TBL_NAME`
|
+|
[hoodie.datasource.write.precombine.field](#hoodiedatasourcewriteprecombinefield)
| ts (Optional) | Field used in preCombining before actual
write. When two records have the same key value, we will pick the one with the
largest value for the precombine field, determined by Object.compareTo(..)<br
/>`Config Param: PRECOMBINE_FIELD_NAME`
|
+| [hoodie.write.concurrency.mode](#hoodiewriteconcurrencymode)
| SINGLE_WRITER (Optional) |
org.apache.hudi.common.model.WriteConcurrencyMode: Concurrency modes for write
operations.<ul> <li>`SINGLE_WRITER`(default): Only one active writer to the
table. Maximizes throughput.</li> <li>`OPTIMISTIC_CONCURRENCY_CONTROL`:
Multiple writers can operate on the table with lazy conflict resolution using
locks. This means that only one writer succeeds if multiple writers write to
the same file group.</li></ul><br />`Config Param: WRITE_CONCURRENCY_MODE` |
+|
[hoodie.write.num.retries.on.conflict.failures](#hoodiewritenumretriesonconflictfailures)
| 0 (Optional) | Maximum number of times to retry a batch on
conflict failure.<br />`Config Param: NUM_RETRIES_ON_CONFLICT_FAILURES`<br
/>`Since Version: 0.13.0`
|
Review Comment:
`hoodie.write.num.retries.on.conflict.failures` is an advanced config, not a basic one.
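For context, here is a rough sketch of how the write configs in this hunk might be assembled into an options map. It assumes the keys and defaults are exactly those shown in the table above and that they are passed as plain strings. The helper name `build_hudi_write_options` and its validation rules are hypothetical and not part of Hudi.

```python
from typing import Dict

# Only the two modes listed in the table above; other releases may define more.
LISTED_CONCURRENCY_MODES = {"SINGLE_WRITER", "OPTIMISTIC_CONCURRENCY_CONTROL"}


def build_hudi_write_options(
    base_path: str,
    table_name: str,
    precombine_field: str = "ts",
    concurrency_mode: str = "SINGLE_WRITER",
    retries_on_conflict: int = 0,
) -> Dict[str, str]:
    """Hypothetical helper: collect the write configs from the table into a dict."""
    # The docs ask for an explicit storage scheme (e.g. hdfs://, s3://).
    if "://" not in base_path:
        raise ValueError("base_path should carry an explicit storage scheme")
    if concurrency_mode not in LISTED_CONCURRENCY_MODES:
        raise ValueError(f"unlisted concurrency mode: {concurrency_mode}")
    if retries_on_conflict < 0:
        raise ValueError("retries_on_conflict must be non-negative")

    return {
        # Basic configs, required.
        "hoodie.base.path": base_path,
        "hoodie.table.name": table_name,
        # Basic configs, optional (defaults as shown in the table).
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.write.concurrency.mode": concurrency_mode,
        # Flagged in this review as an advanced config.
        "hoodie.write.num.retries.on.conflict.failures": str(retries_on_conflict),
    }


options = build_hudi_write_options(
    "s3://example-bucket/tables/trips",  # placeholder path, not from the PR
    "trips",
    concurrency_mode="OPTIMISTIC_CONCURRENCY_CONTROL",
    retries_on_conflict=3,
)
```

Most writers would set only the first four keys. The retry count only becomes relevant once a conflict can occur, which fits its classification as an advanced setting.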
##########
website/docs/basic_configurations.md:
##########
@@ -146,11 +146,13 @@ Configurations used by the Hudi Metadata Table. This
table maintains the metadat
[**Basic Configs**](#Metadata-Configs-basic-configs)
-| Config Name
| Default | Description
|
-|
----------------------------------------------------------------------------------
| ---------------- |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
-| [hoodie.metadata.enable](#hoodiemetadataenable)
| true (Optional) | Enable the internal metadata table which serves
table metadata like level file listings<br /><br />`Config Param: ENABLE`<br
/>`Since Version: 0.7.0`
|
-|
[hoodie.metadata.index.bloom.filter.enable](#hoodiemetadataindexbloomfilterenable)
| false (Optional) | Enable indexing bloom filters of user data files under
metadata table. When enabled, metadata table will have a partition to store the
bloom filter index and will be used during the index lookups.<br /><br
/>`Config Param: ENABLE_METADATA_INDEX_BLOOM_FILTER`<br />`Since Version:
0.11.0` |
-|
[hoodie.metadata.index.column.stats.enable](#hoodiemetadataindexcolumnstatsenable)
| false (Optional) | Enable indexing column ranges of user data files under
metadata table key lookups. When enabled, metadata table will have a partition
to store the column ranges and will be used for pruning files during the index
lookups.<br /><br />`Config Param: ENABLE_METADATA_INDEX_COLUMN_STATS`<br
/>`Since Version: 0.11.0` |
+| Config Name
| Default | Description
|
+|
----------------------------------------------------------------------------------
| --------------------- |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
+| [hoodie.metadata.enable](#hoodiemetadataenable)
| true (Optional) | Enable the internal metadata table which serves
table metadata like level file listings<br />`Config Param: ENABLE`<br />`Since
Version: 0.7.0`
|
+|
[hoodie.metadata.index.bloom.filter.enable](#hoodiemetadataindexbloomfilterenable)
| false (Optional) | Enable indexing bloom filters of user data files
under metadata table. When enabled, metadata table will have a partition to
store the bloom filter index and will be used during the index lookups.<br
/>`Config Param: ENABLE_METADATA_INDEX_BLOOM_FILTER`<br />`Since Version:
0.11.0` |
+|
[hoodie.metadata.index.column.stats.enable](#hoodiemetadataindexcolumnstatsenable)
| false (Optional) | Enable indexing column ranges of user data files
under metadata table key lookups. When enabled, metadata table will have a
partition to store the column ranges and will be used for pruning files during
the index lookups.<br />`Config Param: ENABLE_METADATA_INDEX_COLUMN_STATS`<br
/>`Since Version: 0.11.0` |
+| [hoodie.metadata.max.init.parallelism](#hoodiemetadatamaxinitparallelism)
| 100000 (Optional) | Maximum parallelism to use when initializing
Record Index.<br />`Config Param: RECORD_INDEX_MAX_PARALLELISM`<br />`Since
Version: 0.14.0`
|
+| [hoodie.metadata.max.logfile.size](#hoodiemetadatamaxlogfilesize)
| 2147483648 (Optional) | Maximum size in bytes of a single log file.
Larger log files can contain larger log blocks thereby reducing the number of
blocks to search for keys<br />`Config Param: MAX_LOG_FILE_SIZE_BYTES_PROP`<br
/>`Since Version: 0.14.0`
|
Review Comment:
`hoodie.metadata.max.init.parallelism` and `hoodie.metadata.max.logfile.size` are advanced configs, not basic ones.