This is an automated email from the ASF dual-hosted git repository.
danny0405 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 98f2ca487bd [HUDI-6854][DOCS] Change default payload type to HOODIE_AVRO_DEFAULT (#11551)
98f2ca487bd is described below
commit 98f2ca487bd53eb4edd05187a9a7a7d58140db79
Author: Vova Kolmakov <[email protected]>
AuthorDate: Tue Jul 2 07:28:38 2024 +0700
[HUDI-6854][DOCS] Change default payload type to HOODIE_AVRO_DEFAULT (#11551)
---
website/docs/basic_configurations.md | 63 ++++++-------
website/docs/configurations.md | 173 ++++++++++++++++++-----------------
2 files changed, 120 insertions(+), 116 deletions(-)
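In short: the two docs pages below now list org.apache.hudi.common.model.DefaultHoodieRecordPayload and the new HOODIE_AVRO_DEFAULT payload type as the documented defaults, in place of OverwriteWithLatestAvroPayload. For illustration only, a minimal Spark (Scala) sketch of pinning the payload class explicitly on a datasource write rather than relying on the documented default; it assumes the Hudi Spark bundle is on the classpath, and the DataFrame `df`, the column names, and the path are hypothetical, not part of this commit:

    // Hypothetical upsert; pins the payload class explicitly instead of
    // relying on the documented default (DefaultHoodieRecordPayload).
    df.write.format("hudi").
      option("hoodie.table.name", "trips").
      option("hoodie.datasource.write.recordkey.field", "uuid").
      option("hoodie.datasource.write.precombine.field", "ts").
      option("hoodie.datasource.write.payload.class",
        "org.apache.hudi.common.model.OverwriteWithLatestAvroPayload").
      mode(org.apache.spark.sql.SaveMode.Append).
      save("/tmp/hudi_trips")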
diff --git a/website/docs/basic_configurations.md b/website/docs/basic_configurations.md
index 9d579738ca9..08d5ac717d9 100644
--- a/website/docs/basic_configurations.md
+++ b/website/docs/basic_configurations.md
@@ -1,7 +1,7 @@
---
title: Basic Configurations
summary: This page covers the basic configurations you may use to write/read Hudi tables. This page only features a subset of the most frequently used configurations. For a full list of all configs, please visit the [All Configurations](/docs/configurations) page.
-last_modified_at: 2024-06-06T12:59:56.064
+last_modified_at: 2024-07-01T15:09:57.63
---
@@ -33,36 +33,37 @@ Configurations of the Hudi Table like type of ingestion, storage formats, hive t
[**Basic Configs**](#Hudi-Table-Basic-Configs-basic-configs)
-| Config Name | Default | Description [...]
-| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
-| [hoodie.bootstrap.base.path](#hoodiebootstrapbasepath) | (N/A) | Base path of the dataset that needs to be bootstrapped as a Hudi table<br />`Config Param: BOOTSTRAP_BASE_PATH` [...]
-| [hoodie.database.name](#hoodiedatabasename) | (N/A) | Database name that will be used for incremental query.If different databases have the same table name during incremental query, we can set it to limit the table name under a specific database<br />`Config Param: DATABASE_NAME` [...]
-| [hoodie.table.checksum](#hoodietablechecksum) | (N/A) | Table checksum is used to guard against partial writes in HDFS. It is added as the last entry in hoodie.properties and then used to validate while reading table config.<br />`Config Param: TABLE_CHECKSUM`<br />`Since Version: 0.11.0` [...]
-| [hoodie.table.create.schema](#hoodietablecreateschema) | (N/A) | Schema used when creating the table, for the first time.<br />`Config Param: CREATE_SCHEMA` [...]
-| [hoodie.table.keygenerator.class](#hoodietablekeygeneratorclass) | (N/A) | Key Generator class property for the hoodie table<br />`Config Param: KEY_GENERATOR_CLASS_NAME` [...]
-| [hoodie.table.metadata.partitions](#hoodietablemetadatapartitions) | (N/A) | Comma-separated list of metadata partitions that have been completely built and in-sync with data table. These partitions are ready for use by the readers<br />`Config Param: TABLE_METADATA_PARTITIONS`<br />`Since Version: 0.11.0` [...]
-| [hoodie.table.metadata.partitions.inflight](#hoodietablemetadatapartitionsinflight) | (N/A) | Comma-separated list of metadata partitions whose building is in progress. These partitions are not yet ready for use by the readers.<br />`Config Param: TABLE_METADATA_PARTITIONS_INFLIGHT`<br />`Since Version: 0.11.0` [...]
-| [hoodie.table.name](#hoodietablename) | (N/A) | Table name that will be used for registering with Hive. Needs to be same across runs.<br />`Config Param: NAME` [...]
-| [hoodie.table.partition.fields](#hoodietablepartitionfields) | (N/A) | Fields used to partition the table. Concatenated values of these fields are used as the partition path, by invoking toString()<br />`Config Param: PARTITION_FIELDS` [...]
-| [hoodie.table.precombine.field](#hoodietableprecombinefield) | (N/A) | Field used in preCombining before actual write. By default, when two records have the same key value, the largest value for the precombine field determined by Object.compareTo(..), is picked.<br />`Config Param: PRECOMBINE_FIELD` [...]
-| [hoodie.table.recordkey.fields](#hoodietablerecordkeyfields) | (N/A) | Columns used to uniquely identify the table. Concatenated values of these fields are used as the record key component of HoodieKey.<br />`Config Param: RECORDKEY_FIELDS` [...]
-| [hoodie.table.secondary.indexes.metadata](#hoodietablesecondaryindexesmetadata) | (N/A) | The metadata of secondary indexes<br />`Config Param: SECONDARY_INDEXES_METADATA`<br />`Since Version: 0.13.0` [...]
-| [hoodie.timeline.layout.version](#hoodietimelinelayoutversion) | (N/A) | Version of timeline used, by the table.<br />`Config Param: TIMELINE_LAYOUT_VERSION` [...]
-| [hoodie.archivelog.folder](#hoodiearchivelogfolder) | archived | path under the meta folder, to store archived timeline instants at.<br />`Config Param: ARCHIVELOG_FOLDER` [...]
-| [hoodie.bootstrap.index.class](#hoodiebootstrapindexclass) | org.apache.hudi.common.bootstrap.index.hfile.HFileBootstrapIndex | Implementation to use, for mapping base files to bootstrap base file, that contain actual data.<br />`Config Param: BOOTSTRAP_INDEX_CLASS_NAME` [...]
-| [hoodie.bootstrap.index.enable](#hoodiebootstrapindexenable) | true | Whether or not, this is a bootstrapped table, with bootstrap base data and an mapping index defined, default true.<br />`Config Param: BOOTSTRAP_INDEX_ENABLE` [...]
-| [hoodie.compaction.payload.class](#hoodiecompactionpayloadclass) | org.apache.hudi.common.model.OverwriteWithLatestAvroPayload | Payload class to use for performing compactions, i.e merge delta logs with current base file and then produce a new base file.<br />`Config Param: PAYLOAD_CLASS_NAME` [...]
-| [hoodie.compaction.record.merger.strategy](#hoodiecompactionrecordmergerstrategy) | eeb8d96f-b1e4-49fd-bbf8-28ac514178e5 | Id of merger strategy. Hudi will pick HoodieRecordMerger implementations in hoodie.datasource.write.record.merger.impls which has the same merger strategy id<br />`Config Param: RECORD_MERGER_STRATEGY`<br />`Since Version: 0.13.0` [...]
-| [hoodie.datasource.write.hive_style_partitioning](#hoodiedatasourcewritehive_style_partitioning) | false | Flag to indicate whether to use Hive style partitioning. If set true, the names of partition folders follow <partition_column_name>=<partition_value> format. By default false (the names of partition folders are only partition values)<br />`Config Param: HIVE_STYLE_PARTITIONING_ENABLE` [...]
-| [hoodie.partition.metafile.use.base.format](#hoodiepartitionmetafileusebaseformat) | false | If true, partition metafiles are saved in the same format as base-files for this dataset (e.g. Parquet / ORC). If false (default) partition metafiles are saved as properties files.<br />`Config Param: PARTITION_METAFILE_USE_BASE_FORMAT` [...]
-| [hoodie.populate.meta.fields](#hoodiepopulatemetafields) | true | When enabled, populates all meta fields. When disabled, no meta fields are populated and incremental queries will not be functional. This is only meant to be used for append only/immutable data for batch processing<br />`Config Param: POPULATE_META_FIELDS` [...]
-| [hoodie.table.base.file.format](#hoodietablebasefileformat) | PARQUET | Base file format to store all the base file data.<br />`Config Param: BASE_FILE_FORMAT` [...]
-| [hoodie.table.cdc.enabled](#hoodietablecdcenabled) | false | When enable, persist the change data if necessary, and can be queried as a CDC query mode.<br />`Config Param: CDC_ENABLED`<br />`Since Version: 0.13.0` [...]
-| [hoodie.table.cdc.supplemental.logging.mode](#hoodietablecdcsupplementalloggingmode) | DATA_BEFORE_AFTER | org.apache.hudi.common.table.cdc.HoodieCDCSupplementalLoggingMode: Change log capture supplemental logging mode. The supplemental log is used for accelerating the generation of change log details. OP_KEY_ONLY: Only keeping record keys in the supplemental logs, so the reader needs to figure out the update before image [...]
-| [hoodie.table.log.file.format](#hoodietablelogfileformat) | HOODIE_LOG | Log format used for the delta logs.<br />`Config Param: LOG_FILE_FORMAT` [...]
-| [hoodie.table.timeline.timezone](#hoodietabletimelinetimezone) | LOCAL | User can set hoodie commit timeline timezone, such as utc, local and so on. local is default<br />`Config Param: TIMELINE_TIMEZONE` [...]
-| [hoodie.table.type](#hoodietabletype) | COPY_ON_WRITE | The table type for the underlying data, for this write. This can’t change between writes.<br />`Config Param: TYPE` [...]
-| [hoodie.table.version](#hoodietableversion) | ZERO | Version of table, used for running upgrade/downgrade steps between releases with potentially breaking/backwards compatible changes.<br />`Config Param: VERSION` [...]
+| Config Name | Default | Description [...]
+| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
+| [hoodie.bootstrap.base.path](#hoodiebootstrapbasepath) | (N/A) | Base path of the dataset that needs to be bootstrapped as a Hudi table<br />`Config Param: BOOTSTRAP_BASE_PATH` [...]
+| [hoodie.database.name](#hoodiedatabasename) | (N/A) | Database name that will be used for incremental query.If different databases have the same table name during incremental query, we can set it to limit the table name under a specific database<br />`Config Param: DATABASE_NAME` [...]
+| [hoodie.table.checksum](#hoodietablechecksum) | (N/A) | Table checksum is used to guard against partial writes in HDFS. It is added as the last entry in hoodie.properties and then used to validate while reading table config.<br />`Config Param: TABLE_CHECKSUM`<br />`Since Version: 0.11.0` [...]
+| [hoodie.table.create.schema](#hoodietablecreateschema) | (N/A) | Schema used when creating the table, for the first time.<br />`Config Param: CREATE_SCHEMA` [...]
+| [hoodie.table.keygenerator.class](#hoodietablekeygeneratorclass) | (N/A) | Key Generator class property for the hoodie table<br />`Config Param: KEY_GENERATOR_CLASS_NAME` [...]
+| [hoodie.table.metadata.partitions](#hoodietablemetadatapartitions) | (N/A) | Comma-separated list of metadata partitions that have been completely built and in-sync with data table. These partitions are ready for use by the readers<br />`Config Param: TABLE_METADATA_PARTITIONS`<br />`Since Version: 0.11.0` [...]
+| [hoodie.table.metadata.partitions.inflight](#hoodietablemetadatapartitionsinflight) | (N/A) | Comma-separated list of metadata partitions whose building is in progress. These partitions are not yet ready for use by the readers.<br />`Config Param: TABLE_METADATA_PARTITIONS_INFLIGHT`<br />`Since Version: 0.11.0` [...]
+| [hoodie.table.name](#hoodietablename) | (N/A) | Table name that will be used for registering with Hive. Needs to be same across runs.<br />`Config Param: NAME` [...]
+| [hoodie.table.partition.fields](#hoodietablepartitionfields) | (N/A) | Fields used to partition the table. Concatenated values of these fields are used as the partition path, by invoking toString()<br />`Config Param: PARTITION_FIELDS` [...]
+| [hoodie.table.precombine.field](#hoodietableprecombinefield) | (N/A) | Field used in preCombining before actual write. By default, when two records have the same key value, the largest value for the precombine field determined by Object.compareTo(..), is picked.<br />`Config Param: PRECOMBINE_FIELD` [...]
+| [hoodie.table.recordkey.fields](#hoodietablerecordkeyfields) | (N/A) | Columns used to uniquely identify the table. Concatenated values of these fields are used as the record key component of HoodieKey.<br />`Config Param: RECORDKEY_FIELDS` [...]
+| [hoodie.table.secondary.indexes.metadata](#hoodietablesecondaryindexesmetadata) | (N/A) | The metadata of secondary indexes<br />`Config Param: SECONDARY_INDEXES_METADATA`<br />`Since Version: 0.13.0` [...]
+| [hoodie.timeline.layout.version](#hoodietimelinelayoutversion) | (N/A) | Version of timeline used, by the table.<br />`Config Param: TIMELINE_LAYOUT_VERSION` [...]
+| [hoodie.archivelog.folder](#hoodiearchivelogfolder) | archived | path under the meta folder, to store archived timeline instants at.<br />`Config Param: ARCHIVELOG_FOLDER` [...]
+| [hoodie.bootstrap.index.class](#hoodiebootstrapindexclass) | org.apache.hudi.common.bootstrap.index.hfile.HFileBootstrapIndex | Implementation to use, for mapping base files to bootstrap base file, that contain actual data.<br />`Config Param: BOOTSTRAP_INDEX_CLASS_NAME` [...]
+| [hoodie.bootstrap.index.enable](#hoodiebootstrapindexenable) | true | Whether or not, this is a bootstrapped table, with bootstrap base data and an mapping index defined, default true.<br />`Config Param: BOOTSTRAP_INDEX_ENABLE` [...]
+| [hoodie.compaction.payload.class](#hoodiecompactionpayloadclass) | org.apache.hudi.common.model.DefaultHoodieRecordPayload | Payload class to use for performing compactions, i.e merge delta logs with current base file and then produce a new base file.<br />`Config Param: PAYLOAD_CLASS_NAME` [...]
+| [hoodie.compaction.payload.type](#hoodiecompactionpayloadtype) | HOODIE_AVRO_DEFAULT | org.apache.hudi.common.model.RecordPayloadType: Payload to use for merging records AWS_DMS_AVRO: Provides support for seamlessly applying changes captured via Amazon Database Migration Service onto S3. HOODIE_AVRO: A payload to wrap a existing Hoodie Avro Record. Useful to create a HoodieRecord over existing Gener [...]
+| [hoodie.compaction.record.merger.strategy](#hoodiecompactionrecordmergerstrategy) | eeb8d96f-b1e4-49fd-bbf8-28ac514178e5 | Id of merger strategy. Hudi will pick HoodieRecordMerger implementations in hoodie.datasource.write.record.merger.impls which has the same merger strategy id<br />`Config Param: RECORD_MERGER_STRATEGY`<br />`Since Version: 0.13.0` [...]
+| [hoodie.datasource.write.hive_style_partitioning](#hoodiedatasourcewritehive_style_partitioning) | false | Flag to indicate whether to use Hive style partitioning. If set true, the names of partition folders follow <partition_column_name>=<partition_value> format. By default false (the names of partition folders are only partition values)<br />`Config Param: HIVE_STYLE_PARTITIONING_ENABLE` [...]
+| [hoodie.partition.metafile.use.base.format](#hoodiepartitionmetafileusebaseformat) | false | If true, partition metafiles are saved in the same format as base-files for this dataset (e.g. Parquet / ORC). If false (default) partition metafiles are saved as properties files.<br />`Config Param: PARTITION_METAFILE_USE_BASE_FORMAT` [...]
+| [hoodie.populate.meta.fields](#hoodiepopulatemetafields) | true | When enabled, populates all meta fields. When disabled, no meta fields are populated and incremental queries will not be functional. This is only meant to be used for append only/immutable data for batch processing<br />`Config Param: POPULATE_META_FIELDS` [...]
+| [hoodie.table.base.file.format](#hoodietablebasefileformat) | PARQUET | Base file format to store all the base file data.<br />`Config Param: BASE_FILE_FORMAT` [...]
+| [hoodie.table.cdc.enabled](#hoodietablecdcenabled) | false | When enable, persist the change data if necessary, and can be queried as a CDC query mode.<br />`Config Param: CDC_ENABLED`<br />`Since Version: 0.13.0` [...]
+| [hoodie.table.cdc.supplemental.logging.mode](#hoodietablecdcsupplementalloggingmode) | DATA_BEFORE_AFTER | org.apache.hudi.common.table.cdc.HoodieCDCSupplementalLoggingMode: Change log capture supplemental logging mode. The supplemental log is used for accelerating the generation of change log details. OP_KEY_ONLY: Only keeping record keys in the supplemental logs, so the reader needs to figure out the update before image [...]
+| [hoodie.table.log.file.format](#hoodietablelogfileformat) | HOODIE_LOG | Log format used for the delta logs.<br />`Config Param: LOG_FILE_FORMAT` [...]
+| [hoodie.table.timeline.timezone](#hoodietabletimelinetimezone) | LOCAL | User can set hoodie commit timeline timezone, such as utc, local and so on. local is default<br />`Config Param: TIMELINE_TIMEZONE` [...]
+| [hoodie.table.type](#hoodietabletype) | COPY_ON_WRITE | The table type for the underlying data, for this write. This can’t change between writes.<br />`Config Param: TYPE` [...]
+| [hoodie.table.version](#hoodietableversion) | ZERO | Version of table, used for running upgrade/downgrade steps between releases with potentially breaking/backwards compatible changes.<br />`Config Param: VERSION` [...]
---
## Spark Datasource Configs {#SPARK_DATASOURCE}
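As the table above notes, table-level settings such as hoodie.compaction.payload.class are recorded in hoodie.properties and validated via hoodie.table.checksum on read. A small Scala sketch for checking the effective payload class of an existing table; the base path is hypothetical and the meta folder name (.hoodie) is the usual convention, not something this commit defines:

    import java.io.FileInputStream
    import java.util.Properties

    // Load the table config persisted at creation time (path is hypothetical).
    val props = new Properties()
    val in = new FileInputStream("/tmp/hudi_trips/.hoodie/hoodie.properties")
    try props.load(in) finally in.close()
    println(props.getProperty("hoodie.compaction.payload.class"))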
diff --git a/website/docs/configurations.md b/website/docs/configurations.md
index 278be1f5afa..a6814502cdc 100644
--- a/website/docs/configurations.md
+++ b/website/docs/configurations.md
@@ -5,7 +5,7 @@ permalink: /docs/configurations.html
summary: This page covers the different ways of configuring your job to write/read Hudi tables. At a high level, you can control behaviour at few levels.
toc_min_heading_level: 2
toc_max_heading_level: 4
-last_modified_at: 2024-06-06T12:59:56.026
+last_modified_at: 2024-07-01T15:09:57.588
---
@@ -54,36 +54,37 @@ Configurations of the Hudi Table like type of ingestion, storage formats, hive t
[**Basic Configs**](#Hudi-Table-Basic-Configs-basic-configs)
-| Config Name | Default | Description [...]
-| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
-| [hoodie.bootstrap.base.path](#hoodiebootstrapbasepath) | (N/A) | Base path of the dataset that needs to be bootstrapped as a Hudi table<br />`Config Param: BOOTSTRAP_BASE_PATH` [...]
-| [hoodie.database.name](#hoodiedatabasename) | (N/A) | Database name that will be used for incremental query.If different databases have the same table name during incremental query, we can set it to limit the table name under a specific database<br />`Config Param: DATABASE_NAME` [...]
-| [hoodie.table.checksum](#hoodietablechecksum) | (N/A) | Table checksum is used to guard against partial writes in HDFS. It is added as the last entry in hoodie.properties and then used to validate while reading table config.<br />`Config Param: TABLE_CHECKSUM`<br />`Since Version: 0.11.0` [...]
-| [hoodie.table.create.schema](#hoodietablecreateschema) | (N/A) | Schema used when creating the table, for the first time.<br />`Config Param: CREATE_SCHEMA` [...]
-| [hoodie.table.keygenerator.class](#hoodietablekeygeneratorclass) | (N/A) | Key Generator class property for the hoodie table<br />`Config Param: KEY_GENERATOR_CLASS_NAME` [...]
-| [hoodie.table.metadata.partitions](#hoodietablemetadatapartitions) | (N/A) | Comma-separated list of metadata partitions that have been completely built and in-sync with data table. These partitions are ready for use by the readers<br />`Config Param: TABLE_METADATA_PARTITIONS`<br />`Since Version: 0.11.0` [...]
-| [hoodie.table.metadata.partitions.inflight](#hoodietablemetadatapartitionsinflight) | (N/A) | Comma-separated list of metadata partitions whose building is in progress. These partitions are not yet ready for use by the readers.<br />`Config Param: TABLE_METADATA_PARTITIONS_INFLIGHT`<br />`Since Version: 0.11.0` [...]
-| [hoodie.table.name](#hoodietablename) | (N/A) | Table name that will be used for registering with Hive. Needs to be same across runs.<br />`Config Param: NAME` [...]
-| [hoodie.table.partition.fields](#hoodietablepartitionfields) | (N/A) | Fields used to partition the table. Concatenated values of these fields are used as the partition path, by invoking toString()<br />`Config Param: PARTITION_FIELDS` [...]
-| [hoodie.table.precombine.field](#hoodietableprecombinefield) | (N/A) | Field used in preCombining before actual write. By default, when two records have the same key value, the largest value for the precombine field determined by Object.compareTo(..), is picked.<br />`Config Param: PRECOMBINE_FIELD` [...]
-| [hoodie.table.recordkey.fields](#hoodietablerecordkeyfields) | (N/A) | Columns used to uniquely identify the table. Concatenated values of these fields are used as the record key component of HoodieKey.<br />`Config Param: RECORDKEY_FIELDS` [...]
-| [hoodie.table.secondary.indexes.metadata](#hoodietablesecondaryindexesmetadata) | (N/A) | The metadata of secondary indexes<br />`Config Param: SECONDARY_INDEXES_METADATA`<br />`Since Version: 0.13.0` [...]
-| [hoodie.timeline.layout.version](#hoodietimelinelayoutversion) | (N/A) | Version of timeline used, by the table.<br />`Config Param: TIMELINE_LAYOUT_VERSION` [...]
-| [hoodie.archivelog.folder](#hoodiearchivelogfolder) | archived | path under the meta folder, to store archived timeline instants at.<br />`Config Param: ARCHIVELOG_FOLDER` [...]
-| [hoodie.bootstrap.index.class](#hoodiebootstrapindexclass) | org.apache.hudi.common.bootstrap.index.hfile.HFileBootstrapIndex | Implementation to use, for mapping base files to bootstrap base file, that contain actual data.<br />`Config Param: BOOTSTRAP_INDEX_CLASS_NAME` [...]
-| [hoodie.bootstrap.index.enable](#hoodiebootstrapindexenable) | true | Whether or not, this is a bootstrapped table, with bootstrap base data and an mapping index defined, default true.<br />`Config Param: BOOTSTRAP_INDEX_ENABLE` [...]
-| [hoodie.compaction.payload.class](#hoodiecompactionpayloadclass) | org.apache.hudi.common.model.OverwriteWithLatestAvroPayload | Payload class to use for performing compactions, i.e merge delta logs with current base file and then produce a new base file.<br />`Config Param: PAYLOAD_CLASS_NAME` [...]
-| [hoodie.compaction.record.merger.strategy](#hoodiecompactionrecordmergerstrategy) | eeb8d96f-b1e4-49fd-bbf8-28ac514178e5 | Id of merger strategy. Hudi will pick HoodieRecordMerger implementations in hoodie.datasource.write.record.merger.impls which has the same merger strategy id<br />`Config Param: RECORD_MERGER_STRATEGY`<br />`Since Version: 0.13.0` [...]
-| [hoodie.datasource.write.hive_style_partitioning](#hoodiedatasourcewritehive_style_partitioning) | false | Flag to indicate whether to use Hive style partitioning. If set true, the names of partition folders follow <partition_column_name>=<partition_value> format. By default false (the names of partition folders are only partition values)<br />`Config Param: HIVE_STYLE_PARTITIONING_ENABLE` [...]
-| [hoodie.partition.metafile.use.base.format](#hoodiepartitionmetafileusebaseformat) | false | If true, partition metafiles are saved in the same format as base-files for this dataset (e.g. Parquet / ORC). If false (default) partition metafiles are saved as properties files.<br />`Config Param: PARTITION_METAFILE_USE_BASE_FORMAT` [...]
-| [hoodie.populate.meta.fields](#hoodiepopulatemetafields) | true | When enabled, populates all meta fields. When disabled, no meta fields are populated and incremental queries will not be functional. This is only meant to be used for append only/immutable data for batch processing<br />`Config Param: POPULATE_META_FIELDS` [...]
-| [hoodie.table.base.file.format](#hoodietablebasefileformat) | PARQUET | Base file format to store all the base file data.<br />`Config Param: BASE_FILE_FORMAT` [...]
-| [hoodie.table.cdc.enabled](#hoodietablecdcenabled) | false | When enable, persist the change data if necessary, and can be queried as a CDC query mode.<br />`Config Param: CDC_ENABLED`<br />`Since Version: 0.13.0` [...]
-| [hoodie.table.cdc.supplemental.logging.mode](#hoodietablecdcsupplementalloggingmode) | DATA_BEFORE_AFTER | org.apache.hudi.common.table.cdc.HoodieCDCSupplementalLoggingMode: Change log capture supplemental logging mode. The supplemental log is used for accelerating the generation of change log details. OP_KEY_ONLY: Only keeping record keys in the supplemental logs, so the reader needs to figure out the update before image [...]
-| [hoodie.table.log.file.format](#hoodietablelogfileformat) | HOODIE_LOG | Log format used for the delta logs.<br />`Config Param: LOG_FILE_FORMAT` [...]
-| [hoodie.table.timeline.timezone](#hoodietabletimelinetimezone) | LOCAL | User can set hoodie commit timeline timezone, such as utc, local and so on. local is default<br />`Config Param: TIMELINE_TIMEZONE` [...]
-| [hoodie.table.type](#hoodietabletype) | COPY_ON_WRITE | The table type for the underlying data, for this write. This can’t change between writes.<br />`Config Param: TYPE` [...]
-| [hoodie.table.version](#hoodietableversion) | ZERO | Version of table, used for running upgrade/downgrade steps between releases with potentially breaking/backwards compatible changes.<br />`Config Param: VERSION` [...]
+| Config Name | Default | Description [...]
+| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
+| [hoodie.bootstrap.base.path](#hoodiebootstrapbasepath) | (N/A) | Base path of the dataset that needs to be bootstrapped as a Hudi table<br />`Config Param: BOOTSTRAP_BASE_PATH` [...]
+| [hoodie.database.name](#hoodiedatabasename) | (N/A) | Database name that will be used for incremental query.If different databases have the same table name during incremental query, we can set it to limit the table name under a specific database<br />`Config Param: DATABASE_NAME` [...]
+| [hoodie.table.checksum](#hoodietablechecksum) | (N/A) | Table checksum is used to guard against partial writes in HDFS. It is added as the last entry in hoodie.properties and then used to validate while reading table config.<br />`Config Param: TABLE_CHECKSUM`<br />`Since Version: 0.11.0` [...]
+| [hoodie.table.create.schema](#hoodietablecreateschema) | (N/A) | Schema used when creating the table, for the first time.<br />`Config Param: CREATE_SCHEMA` [...]
+| [hoodie.table.keygenerator.class](#hoodietablekeygeneratorclass) | (N/A) | Key Generator class property for the hoodie table<br />`Config Param: KEY_GENERATOR_CLASS_NAME` [...]
+| [hoodie.table.metadata.partitions](#hoodietablemetadatapartitions) | (N/A) | Comma-separated list of metadata partitions that have been completely built and in-sync with data table. These partitions are ready for use by the readers<br />`Config Param: TABLE_METADATA_PARTITIONS`<br />`Since Version: 0.11.0` [...]
+| [hoodie.table.metadata.partitions.inflight](#hoodietablemetadatapartitionsinflight) | (N/A) | Comma-separated list of metadata partitions whose building is in progress. These partitions are not yet ready for use by the readers.<br />`Config Param: TABLE_METADATA_PARTITIONS_INFLIGHT`<br />`Since Version: 0.11.0` [...]
+| [hoodie.table.name](#hoodietablename) | (N/A) | Table name that will be used for registering with Hive. Needs to be same across runs.<br />`Config Param: NAME` [...]
+| [hoodie.table.partition.fields](#hoodietablepartitionfields) | (N/A) | Fields used to partition the table. Concatenated values of these fields are used as the partition path, by invoking toString()<br />`Config Param: PARTITION_FIELDS` [...]
+| [hoodie.table.precombine.field](#hoodietableprecombinefield) | (N/A) | Field used in preCombining before actual write. By default, when two records have the same key value, the largest value for the precombine field determined by Object.compareTo(..), is picked.<br />`Config Param: PRECOMBINE_FIELD` [...]
+| [hoodie.table.recordkey.fields](#hoodietablerecordkeyfields) | (N/A) | Columns used to uniquely identify the table. Concatenated values of these fields are used as the record key component of HoodieKey.<br />`Config Param: RECORDKEY_FIELDS` [...]
+| [hoodie.table.secondary.indexes.metadata](#hoodietablesecondaryindexesmetadata) | (N/A) | The metadata of secondary indexes<br />`Config Param: SECONDARY_INDEXES_METADATA`<br />`Since Version: 0.13.0` [...]
+| [hoodie.timeline.layout.version](#hoodietimelinelayoutversion) | (N/A) | Version of timeline used, by the table.<br />`Config Param: TIMELINE_LAYOUT_VERSION` [...]
+| [hoodie.archivelog.folder](#hoodiearchivelogfolder) | archived | path under the meta folder, to store archived timeline instants at.<br />`Config Param: ARCHIVELOG_FOLDER` [...]
+| [hoodie.bootstrap.index.class](#hoodiebootstrapindexclass) | org.apache.hudi.common.bootstrap.index.hfile.HFileBootstrapIndex | Implementation to use, for mapping base files to bootstrap base file, that contain actual data.<br />`Config Param: BOOTSTRAP_INDEX_CLASS_NAME` [...]
+| [hoodie.bootstrap.index.enable](#hoodiebootstrapindexenable) | true | Whether or not, this is a bootstrapped table, with bootstrap base data and an mapping index defined, default true.<br />`Config Param: BOOTSTRAP_INDEX_ENABLE` [...]
+| [hoodie.compaction.payload.class](#hoodiecompactionpayloadclass) | org.apache.hudi.common.model.DefaultHoodieRecordPayload | Payload class to use for performing compactions, i.e merge delta logs with current base file and then produce a new base file.<br />`Config Param: PAYLOAD_CLASS_NAME` [...]
+| [hoodie.compaction.payload.type](#hoodiecompactionpayloadtype) | HOODIE_AVRO_DEFAULT | org.apache.hudi.common.model.RecordPayloadType: Payload to use for merging records AWS_DMS_AVRO: Provides support for seamlessly applying changes captured via Amazon Database Migration Service onto S3. HOODIE_AVRO: A payload to wrap a existing Hoodie Avro Record. Useful to create a HoodieRecord over existing Gener [...]
+| [hoodie.compaction.record.merger.strategy](#hoodiecompactionrecordmergerstrategy) | eeb8d96f-b1e4-49fd-bbf8-28ac514178e5 | Id of merger strategy. Hudi will pick HoodieRecordMerger implementations in hoodie.datasource.write.record.merger.impls which has the same merger strategy id<br />`Config Param: RECORD_MERGER_STRATEGY`<br />`Since Version: 0.13.0` [...]
+| [hoodie.datasource.write.hive_style_partitioning](#hoodiedatasourcewritehive_style_partitioning) | false | Flag to indicate whether to use Hive style partitioning. If set true, the names of partition folders follow <partition_column_name>=<partition_value> format. By default false (the names of partition folders are only partition values)<br />`Config Param: HIVE_STYLE_PARTITIONING_ENABLE` [...]
+| [hoodie.partition.metafile.use.base.format](#hoodiepartitionmetafileusebaseformat) | false | If true, partition metafiles are saved in the same format as base-files for this dataset (e.g. Parquet / ORC). If false (default) partition metafiles are saved as properties files.<br />`Config Param: PARTITION_METAFILE_USE_BASE_FORMAT` [...]
+| [hoodie.populate.meta.fields](#hoodiepopulatemetafields) | true | When enabled, populates all meta fields. When disabled, no meta fields are populated and incremental queries will not be functional. This is only meant to be used for append only/immutable data for batch processing<br />`Config Param: POPULATE_META_FIELDS` [...]
+| [hoodie.table.base.file.format](#hoodietablebasefileformat) | PARQUET | Base file format to store all the base file data.<br />`Config Param: BASE_FILE_FORMAT` [...]
+| [hoodie.table.cdc.enabled](#hoodietablecdcenabled) | false | When enable, persist the change data if necessary, and can be queried as a CDC query mode.<br />`Config Param: CDC_ENABLED`<br />`Since Version: 0.13.0` [...]
+| [hoodie.table.cdc.supplemental.logging.mode](#hoodietablecdcsupplementalloggingmode) | DATA_BEFORE_AFTER | org.apache.hudi.common.table.cdc.HoodieCDCSupplementalLoggingMode: Change log capture supplemental logging mode. The supplemental log is used for accelerating the generation of change log details. OP_KEY_ONLY: Only keeping record keys in the supplemental logs, so the reader needs to figure out the update before image [...]
+| [hoodie.table.log.file.format](#hoodietablelogfileformat) | HOODIE_LOG | Log format used for the delta logs.<br />`Config Param: LOG_FILE_FORMAT` [...]
+| [hoodie.table.timeline.timezone](#hoodietabletimelinetimezone) | LOCAL | User can set hoodie commit timeline timezone, such as utc, local and so on. local is default<br />`Config Param: TIMELINE_TIMEZONE` [...]
+| [hoodie.table.type](#hoodietabletype) | COPY_ON_WRITE | The table type for the underlying data, for this write. This can’t change between writes.<br />`Config Param: TYPE` [...]
+| [hoodie.table.version](#hoodietableversion) | ZERO | Version of table, used for running upgrade/downgrade steps between releases with potentially breaking/backwards compatible changes.<br />`Config Param: VERSION` [...]
[**Advanced Configs**](#Hudi-Table-Basic-Configs-advanced-configs)
@@ -181,58 +182,59 @@ Options useful for writing tables via `write.format.option(...)`
[**Advanced Configs**](#Write-Options-advanced-configs)
-| Config Name
| Default
| Description
[...]
-|
------------------------------------------------------------------------------------------------------------------------------------------------
| ------------------------------------------------------------ |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
-|
[hoodie.datasource.hive_sync.serde_properties](#hoodiedatasourcehive_syncserde_properties)
| (N/A)
| Serde properties to hive table.<br
/>`Config Param: HIVE_TABLE_SERDE_PROPERTIES`
[...]
-|
[hoodie.datasource.hive_sync.table_properties](#hoodiedatasourcehive_synctable_properties)
| (N/A)
| Additional properties to store with
table.<br />`Config Param: HIVE_TABLE_PROPERTIES`
[...]
-| [hoodie.datasource.overwrite.mode](#hoodiedatasourceoverwritemode)
| (N/A)
| Controls whether overwrite
use dynamic or static mode, if not configured, respect
spark.sql.sources.partitionOverwriteMode<br />`Config Param: OVERWRITE_MODE`<br
/>`Since Version: 0.14.0`
[...]
-|
[hoodie.datasource.write.partitions.to.delete](#hoodiedatasourcewritepartitionstodelete)
| (N/A)
| Comma separated list of partitions to
delete. Allows use of wildcard *<br />`Config Param: PARTITIONS_TO_DELETE`
[...]
-| [hoodie.datasource.write.table.name](#hoodiedatasourcewritetablename)
| (N/A)
| Table name for the
datasource write. Also used to register the table into meta stores.<br
/>`Config Param: TABLE_NAME`
[...]
-|
[hoodie.datasource.compaction.async.enable](#hoodiedatasourcecompactionasyncenable)
| true
| Controls whether async
compaction should be turned on for MOR table writing.<br />`Config Param:
ASYNC_COMPACT_ENABLE`
[...]
-|
[hoodie.datasource.hive_sync.assume_date_partitioning](#hoodiedatasourcehive_syncassume_date_partitioning)
| false
| Assume partitioning is yyyy/MM/dd<br />`Config Param:
HIVE_ASSUME_DATE_PARTITION`
[...]
-|
[hoodie.datasource.hive_sync.auto_create_database](#hoodiedatasourcehive_syncauto_create_database)
| true
| Auto create hive database if does not exists<br
/>`Config Param: HIVE_AUTO_CREATE_DATABASE`
[...]
-|
[hoodie.datasource.hive_sync.base_file_format](#hoodiedatasourcehive_syncbase_file_format)
| PARQUET
| Base file format for the sync.<br
/>`Config Param: HIVE_BASE_FILE_FORMAT`
[...]
-| [hoodie.datasource.hive_sync.batch_num](#hoodiedatasourcehive_syncbatch_num)
| 1000
| The number of partitions
one batch when synchronous partitions to hive.<br />`Config Param:
HIVE_BATCH_SYNC_PARTITION_NUM`
[...]
-|
[hoodie.datasource.hive_sync.bucket_sync](#hoodiedatasourcehive_syncbucket_sync)
| false
| Whether sync hive metastore
bucket specification when using bucket index.The specification is 'CLUSTERED BY
(trace_id) SORTED BY (trace_id ASC) INTO 65536 BUCKETS'<br />`Config Param:
HIVE_SYNC_BUCKET_SYNC`
[...]
-|
[hoodie.datasource.hive_sync.create_managed_table](#hoodiedatasourcehive_synccreate_managed_table)
| false
| Whether to sync the table as managed table.<br
/>`Config Param: HIVE_CREATE_MANAGED_TABLE`
[...]
-| [hoodie.datasource.hive_sync.database](#hoodiedatasourcehive_syncdatabase)
| default
| The name of the
destination database that we should sync the hudi table to.<br />`Config Param:
HIVE_DATABASE`
[...]
-|
[hoodie.datasource.hive_sync.ignore_exceptions](#hoodiedatasourcehive_syncignore_exceptions)
| false
| Ignore exceptions when syncing with
Hive.<br />`Config Param: HIVE_IGNORE_EXCEPTIONS`
[...]
-|
[hoodie.datasource.hive_sync.partition_extractor_class](#hoodiedatasourcehive_syncpartition_extractor_class)
|
org.apache.hudi.hive.MultiPartKeysValueExtractor | Class which
implements PartitionValueExtractor to extract the partition values, default
'org.apache.hudi.hive.MultiPartKeysValueExtractor'.<br />`Config Param:
HIVE_PARTITION_EXTRACTOR_CLASS`
[...]
-|
[hoodie.datasource.hive_sync.partition_fields](#hoodiedatasourcehive_syncpartition_fields)
|
| Field in the table to use for
determining hive partition columns.<br />`Config Param: HIVE_PARTITION_FIELDS`
[...]
-| [hoodie.datasource.hive_sync.password](#hoodiedatasourcehive_syncpassword)
| hive
| hive password to use<br
/>`Config Param: HIVE_PASS`
[...]
-|
[hoodie.datasource.hive_sync.skip_ro_suffix](#hoodiedatasourcehive_syncskip_ro_suffix)
| false
| Skip the _ro suffix for Read
optimized table, when registering<br />`Config Param:
HIVE_SKIP_RO_SUFFIX_FOR_READ_OPTIMIZED_TABLE`
[...]
-|
[hoodie.datasource.hive_sync.support_timestamp](#hoodiedatasourcehive_syncsupport_timestamp)
| false
| ‘INT64’ with original type
TIMESTAMP_MICROS is converted to hive ‘timestamp’ type. Disabled by default for
backward compatibility. NOTE: On Spark entrypoints, this is defaulted to
TRUE<br />`Config Param: HIVE_SUPPORT_TIMESTAMP_TYPE`
[...]
-|
[hoodie.datasource.hive_sync.sync_as_datasource](#hoodiedatasourcehive_syncsync_as_datasource)
| true
| <br />`Config Param:
HIVE_SYNC_AS_DATA_SOURCE_TABLE`
[...]
-|
[hoodie.datasource.hive_sync.sync_comment](#hoodiedatasourcehive_syncsync_comment)
| false
| Whether to sync the table
column comments while syncing the table.<br />`Config Param: HIVE_SYNC_COMMENT`
[...]
-| [hoodie.datasource.hive_sync.table](#hoodiedatasourcehive_synctable)
| unknown
| The name of the
destination table that we should sync the hudi table to.<br />`Config Param:
HIVE_TABLE`
[...]
-| [hoodie.datasource.hive_sync.use_jdbc](#hoodiedatasourcehive_syncuse_jdbc)
| true
| Use JDBC when hive
synchronization is enabled<br />`Config Param: HIVE_USE_JDBC`
[...]
-|
[hoodie.datasource.hive_sync.use_pre_apache_input_format](#hoodiedatasourcehive_syncuse_pre_apache_input_format)
| false
| Flag to choose InputFormat under com.uber.hoodie package
instead of org.apache.hudi package. Use this when you are in the process of
migrating from com.uber.hoodie to org.apache.hudi. Stop using this after you
migrated the table definition to org.apache.hudi input format<br />`Co [...]
-| [hoodie.datasource.hive_sync.username](#hoodiedatasourcehive_syncusername)
| hive
| hive user name to use<br
/>`Config Param: HIVE_USER`
[...]
-| [hoodie.datasource.insert.dup.policy](#hoodiedatasourceinsertduppolicy)
| none
| **Note** This is only
applicable to Spark SQL writing.<br />When operation type is set to
"insert", users can optionally enforce a dedup policy. This policy will be
employed when records being ingested already exists in storage. Default policy
is none and no action will be [...]
-|
[hoodie.datasource.meta_sync.condition.sync](#hoodiedatasourcemeta_syncconditionsync)
| false
| If true, only sync on conditions
like schema change or partition change.<br />`Config Param:
HIVE_CONDITIONAL_SYNC`
[...]
-|
[hoodie.datasource.write.commitmeta.key.prefix](#hoodiedatasourcewritecommitmetakeyprefix)
| _
| Option keys beginning with this prefix,
are automatically added to the commit/deltacommit metadata. This is useful to
store checkpointing information, in a consistent way with the hudi timeline<br
/>`Config Param: COMMIT_METADATA_KEYPREFIX`
[...]
-|
[hoodie.datasource.write.drop.partition.columns](#hoodiedatasourcewritedroppartitioncolumns)
| false
| When set to true, will not write the
partition columns into hudi. By default, false.<br />`Config Param:
DROP_PARTITION_COLUMNS`
[...]
-|
[hoodie.datasource.write.insert.drop.duplicates](#hoodiedatasourcewriteinsertdropduplicates)
| false
| If set to true, records from the incoming
dataframe will not overwrite existing records with the same key during the
write operation. <br /> **Note** Just for Insert operation in Spark SQL
writing since 0.14.0, users can switch to the config
`hoodie.datasource.insert.dup.po [...]
-|
[hoodie.datasource.write.keygenerator.class](#hoodiedatasourcewritekeygeneratorclass)
|
org.apache.hudi.keygen.SimpleKeyGenerator | Key generator
class, that implements `org.apache.hudi.keygen.KeyGenerator`<br />`Config
Param: KEYGENERATOR_CLASS_NAME`
[...]
-|
[hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled](#hoodiedatasourcewritekeygeneratorconsistentlogicaltimestampenabled)
| false | When set to
true, consistent value will be generated for a logical timestamp type column,
like timestamp-millis and timestamp-micros, irrespective of whether row-writer
is enabled. Disabled by default so as not to break the pipeline that deploy
either fully row-writer path or non [...]
-|
[hoodie.datasource.write.partitionpath.urlencode](#hoodiedatasourcewritepartitionpathurlencode)
| false
| Should we url encode the partition path
value, before creating the folder structure.<br />`Config Param:
URL_ENCODE_PARTITIONING`
[...]
-| [hoodie.datasource.write.payload.class](#hoodiedatasourcewritepayloadclass)
|
org.apache.hudi.common.model.OverwriteWithLatestAvroPayload | Payload class
used. Override this, if you like to roll your own merge logic, when
upserting/inserting. This will render any value set for
PRECOMBINE_FIELD_OPT_VAL in-effective<br />`Config Param: PAYLOAD_CLASS_NAME`
[...]
-|
[hoodie.datasource.write.reconcile.schema](#hoodiedatasourcewritereconcileschema)
| false
| This config controls how
writer's schema will be selected based on the incoming batch's schema as well
as existing table's one. When schema reconciliation is DISABLED, incoming
batch's schema will be picked as a writer-schema (therefore updating table's
schema). When schema recon [...]
-|
[hoodie.datasource.write.record.merger.impls](#hoodiedatasourcewriterecordmergerimpls)
|
org.apache.hudi.common.model.HoodieAvroRecordMerger | List of
HoodieMerger implementations constituting Hudi's merging strategy -- based on
the engine used. These merger impls will filter by
hoodie.datasource.write.record.merger.strategy Hudi will pick most efficient
implementation to perform merging/combining of the records (during [...]
-|
[hoodie.datasource.write.record.merger.strategy](#hoodiedatasourcewriterecordmergerstrategy)
|
eeb8d96f-b1e4-49fd-bbf8-28ac514178e5 | Id of merger
strategy. Hudi will pick HoodieRecordMerger implementations in
hoodie.datasource.write.record.merger.impls which has the same merger strategy
id<br />`Config Param: RECORD_MERGER_STRATEGY`<br />`Since Version: 0.13.0`
[...]
-|
[hoodie.datasource.write.row.writer.enable](#hoodiedatasourcewriterowwriterenable)
| true
| When set to true, will perform
write operations directly using the spark native `Row` representation, avoiding
any additional conversion costs.<br />`Config Param: ENABLE_ROW_WRITER`
[...]
-|
[hoodie.datasource.write.streaming.checkpoint.identifier](#hoodiedatasourcewritestreamingcheckpointidentifier)
| default_single_writer
| A stream identifier used for HUDI to fetch the right
checkpoint(`batch id` to be more specific) corresponding this writer. Please
note that keep the identifier an unique value for different writer if under
multi-writer scenario. If the value is not set, will only keep the checkpo [...]
-|
[hoodie.datasource.write.streaming.disable.compaction](#hoodiedatasourcewritestreamingdisablecompaction)
| false
| By default for MOR table, async compaction is enabled
with spark streaming sink. By setting this config to true, we can disable it
and the expectation is that, users will schedule and execute compaction in a
different process/job altogether. Some users may wish to run it separate [...]
-|
[hoodie.datasource.write.streaming.ignore.failed.batch](#hoodiedatasourcewritestreamingignorefailedbatch)
| false
| Config to indicate whether to ignore any non exception
error (e.g. writestatus error) within a streaming microbatch. Turning this on,
could hide the write status errors while the spark checkpoint moves ahead.So,
would recommend users to use this with caution.<br />`Config Param: [...]
-|
[hoodie.datasource.write.streaming.retry.count](#hoodiedatasourcewritestreamingretrycount)
| 3
| Config to indicate how many times
streaming job should retry for a failed micro batch.<br />`Config Param:
STREAMING_RETRY_CNT`
[...]
-|
[hoodie.datasource.write.streaming.retry.interval.ms](#hoodiedatasourcewritestreamingretryintervalms)
| 2000
| Config to indicate how long (by millisecond)
before a retry should issued for failed microbatch<br />`Config Param:
STREAMING_RETRY_INTERVAL_MS`
[...]
-| [hoodie.meta.sync.client.tool.class](#hoodiemetasyncclienttoolclass)
|
org.apache.hudi.hive.HiveSyncTool | Sync tool class
name used to sync to metastore. Defaults to Hive.<br />`Config Param:
META_SYNC_CLIENT_TOOL_CLASS_NAME`
[...]
-| [hoodie.spark.sql.insert.into.operation](#hoodiesparksqlinsertintooperation)
| insert
| Sql write operation to use
with INSERT_INTO spark sql command. This comes with 3 possible values,
bulk_insert, insert and upsert. bulk_insert is generally meant for initial
loads and is known to be performant compared to insert. But bulk_insert may not
do small file management. I [...]
-|
[hoodie.spark.sql.optimized.writes.enable](#hoodiesparksqloptimizedwritesenable)
| true
| Controls whether spark sql
prepped update, delete, and merge are enabled.<br />`Config Param:
SPARK_SQL_OPTIMIZED_WRITES`<br />`Since Version: 0.14.0`
[...]
-| [hoodie.sql.bulk.insert.enable](#hoodiesqlbulkinsertenable)
| false
| When set to true, the sql
insert statement will use bulk insert. This config is deprecated as of 0.14.0.
Please use hoodie.spark.sql.insert.into.operation instead.<br />`Config Param:
SQL_ENABLE_BULK_INSERT`
[...]
-| [hoodie.sql.insert.mode](#hoodiesqlinsertmode)
| upsert
| Insert mode when insert
data to pk-table. The optional modes are: upsert, strict and non-strict.For
upsert mode, insert statement do the upsert operation for the pk-table which
will update the duplicate record.For strict mode, insert statement will keep
the primary key uniqueness [...]
-|
[hoodie.streamer.source.kafka.value.deserializer.class](#hoodiestreamersourcekafkavaluedeserializerclass)
|
io.confluent.kafka.serializers.KafkaAvroDeserializer | This class is
used by kafka client to deserialize the records<br />`Config Param:
KAFKA_AVRO_VALUE_DESERIALIZER_CLASS`<br />`Since Version: 0.9.0`
[...]
-|
[hoodie.write.set.null.for.missing.columns](#hoodiewritesetnullformissingcolumns)
| false
| When a nullable column is
missing from incoming batch during a write operation, the write operation will
fail schema compatibility check. Set this option to true will make the missing
column be filled with null values to successfully complete the write
operation.<br />`Config P [...]
+| Config Name
| Default
| Description
[...]
+|
------------------------------------------------------------------------------------------------------------------------------------------------
| -------------------------------------------------------- |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[...]
+|
[hoodie.datasource.hive_sync.serde_properties](#hoodiedatasourcehive_syncserde_properties)
| (N/A)
| Serde properties to hive table.<br
/>`Config Param: HIVE_TABLE_SERDE_PROPERTIES`
[...]
+|
[hoodie.datasource.hive_sync.table_properties](#hoodiedatasourcehive_synctable_properties)
| (N/A)
| Additional properties to store with
table.<br />`Config Param: HIVE_TABLE_PROPERTIES`
[...]
+| [hoodie.datasource.overwrite.mode](#hoodiedatasourceoverwritemode)
| (N/A)
| Controls whether overwrite use
dynamic or static mode, if not configured, respect
spark.sql.sources.partitionOverwriteMode<br />`Config Param: OVERWRITE_MODE`<br
/>`Since Version: 0.14.0`
[...]
+|
[hoodie.datasource.write.partitions.to.delete](#hoodiedatasourcewritepartitionstodelete)
| (N/A)
| Comma separated list of partitions to
delete. Allows use of wildcard *<br />`Config Param: PARTITIONS_TO_DELETE`
[...]
+| [hoodie.datasource.write.table.name](#hoodiedatasourcewritetablename)
| (N/A)
| Table name for the datasource
write. Also used to register the table into meta stores.<br />`Config Param:
TABLE_NAME`
[...]
+|
[hoodie.datasource.compaction.async.enable](#hoodiedatasourcecompactionasyncenable)
| true
| Controls whether async compaction
should be turned on for MOR table writing.<br />`Config Param:
ASYNC_COMPACT_ENABLE`
[...]
+|
[hoodie.datasource.hive_sync.assume_date_partitioning](#hoodiedatasourcehive_syncassume_date_partitioning)
| false
| Assume partitioning is yyyy/MM/dd<br />`Config Param:
HIVE_ASSUME_DATE_PARTITION`
[...]
+|
[hoodie.datasource.hive_sync.auto_create_database](#hoodiedatasourcehive_syncauto_create_database)
| true
| Auto create hive database if does not exists<br
/>`Config Param: HIVE_AUTO_CREATE_DATABASE`
[...]
+|
[hoodie.datasource.hive_sync.base_file_format](#hoodiedatasourcehive_syncbase_file_format)
| PARQUET
| Base file format for the sync.<br />`Config
Param: HIVE_BASE_FILE_FORMAT`
[...]
+| [hoodie.datasource.hive_sync.batch_num](#hoodiedatasourcehive_syncbatch_num)
| 1000
| The number of partitions one
batch when synchronous partitions to hive.<br />`Config Param:
HIVE_BATCH_SYNC_PARTITION_NUM`
[...]
+|
[hoodie.datasource.hive_sync.bucket_sync](#hoodiedatasourcehive_syncbucket_sync)
| false
| Whether sync hive metastore
bucket specification when using bucket index.The specification is 'CLUSTERED BY
(trace_id) SORTED BY (trace_id ASC) INTO 65536 BUCKETS'<br />`Config Param:
HIVE_SYNC_BUCKET_SYNC`
[...]
+|
[hoodie.datasource.hive_sync.create_managed_table](#hoodiedatasourcehive_synccreate_managed_table)
| false
| Whether to sync the table as managed table.<br
/>`Config Param: HIVE_CREATE_MANAGED_TABLE`
[...]
+| [hoodie.datasource.hive_sync.database](#hoodiedatasourcehive_syncdatabase)
| default
| The name of the destination
database that we should sync the hudi table to.<br />`Config Param:
HIVE_DATABASE`
[...]
+|
[hoodie.datasource.hive_sync.ignore_exceptions](#hoodiedatasourcehive_syncignore_exceptions)
| false
| Ignore exceptions when syncing with Hive.<br
/>`Config Param: HIVE_IGNORE_EXCEPTIONS`
[...]
+|
[hoodie.datasource.hive_sync.partition_extractor_class](#hoodiedatasourcehive_syncpartition_extractor_class)
|
org.apache.hudi.hive.MultiPartKeysValueExtractor | Class which
implements PartitionValueExtractor to extract the partition values, default
'org.apache.hudi.hive.MultiPartKeysValueExtractor'.<br />`Config Param:
HIVE_PARTITION_EXTRACTOR_CLASS`
[...]
+|
[hoodie.datasource.hive_sync.partition_fields](#hoodiedatasourcehive_syncpartition_fields)
|
| Field in the table to use for determining
hive partition columns.<br />`Config Param: HIVE_PARTITION_FIELDS`
[...]
+| [hoodie.datasource.hive_sync.password](#hoodiedatasourcehive_syncpassword)
| hive
| hive password to use<br
/>`Config Param: HIVE_PASS`
[...]
+|
[hoodie.datasource.hive_sync.skip_ro_suffix](#hoodiedatasourcehive_syncskip_ro_suffix)
| false
| Skip the _ro suffix for Read optimized
table, when registering<br />`Config Param:
HIVE_SKIP_RO_SUFFIX_FOR_READ_OPTIMIZED_TABLE`
[...]
+|
[hoodie.datasource.hive_sync.support_timestamp](#hoodiedatasourcehive_syncsupport_timestamp)
| false
| ‘INT64’ with original type TIMESTAMP_MICROS
is converted to hive ‘timestamp’ type. Disabled by default for backward
compatibility. NOTE: On Spark entrypoints, this is defaulted to TRUE<br
/>`Config Param: HIVE_SUPPORT_TIMESTAMP_TYPE`
[...]
+|
[hoodie.datasource.hive_sync.sync_as_datasource](#hoodiedatasourcehive_syncsync_as_datasource)
| true
| <br />`Config Param:
HIVE_SYNC_AS_DATA_SOURCE_TABLE`
[...]
+|
[hoodie.datasource.hive_sync.sync_comment](#hoodiedatasourcehive_syncsync_comment)
| false
| Whether to sync the table column
comments while syncing the table.<br />`Config Param: HIVE_SYNC_COMMENT`
[...]
+| [hoodie.datasource.hive_sync.table](#hoodiedatasourcehive_synctable)
| unknown
| The name of the destination
table that we should sync the hudi table to.<br />`Config Param: HIVE_TABLE`
[...]
+| [hoodie.datasource.hive_sync.use_jdbc](#hoodiedatasourcehive_syncuse_jdbc)
| true
| Use JDBC when hive
synchronization is enabled<br />`Config Param: HIVE_USE_JDBC`
[...]
+|
[hoodie.datasource.hive_sync.use_pre_apache_input_format](#hoodiedatasourcehive_syncuse_pre_apache_input_format)
| false
| Flag to choose InputFormat under com.uber.hoodie package instead
of org.apache.hudi package. Use this when you are in the process of migrating
from com.uber.hoodie to org.apache.hudi. Stop using this after you migrated the
table definition to org.apache.hudi input format<br />`Config [...]
+| [hoodie.datasource.hive_sync.username](#hoodiedatasourcehive_syncusername)
| hive
| hive user name to use<br
/>`Config Param: HIVE_USER`
[...]
+| [hoodie.datasource.insert.dup.policy](#hoodiedatasourceinsertduppolicy)
| none
| **Note** This is only
applicable to Spark SQL writing.<br />When operation type is set to
"insert", users can optionally enforce a dedup policy. This policy will be
employed when records being ingested already exists in storage. Default policy
is none and no action will be tak [...]
+|
[hoodie.datasource.meta_sync.condition.sync](#hoodiedatasourcemeta_syncconditionsync)
| false
| If true, only sync on conditions like
schema change or partition change.<br />`Config Param: HIVE_CONDITIONAL_SYNC`
[...]
+|
[hoodie.datasource.write.commitmeta.key.prefix](#hoodiedatasourcewritecommitmetakeyprefix)
| _
| Option keys beginning with this prefix, are
automatically added to the commit/deltacommit metadata. This is useful to store
checkpointing information, in a consistent way with the hudi timeline<br
/>`Config Param: COMMIT_METADATA_KEYPREFIX`
[...]
+|
[hoodie.datasource.write.drop.partition.columns](#hoodiedatasourcewritedroppartitioncolumns)
| false
| When set to true, will not write the
partition columns into hudi. By default, false.<br />`Config Param:
DROP_PARTITION_COLUMNS`
[...]
+|
[hoodie.datasource.write.insert.drop.duplicates](#hoodiedatasourcewriteinsertdropduplicates)
| false
| If set to true, records from the incoming
dataframe will not overwrite existing records with the same key during the
write operation. <br /> **Note** Just for Insert operation in Spark SQL
writing since 0.14.0, users can switch to the config
`hoodie.datasource.insert.dup.policy [...]
+|
[hoodie.datasource.write.keygenerator.class](#hoodiedatasourcewritekeygeneratorclass)
|
org.apache.hudi.keygen.SimpleKeyGenerator | Key generator class,
that implements `org.apache.hudi.keygen.KeyGenerator`<br />`Config Param:
KEYGENERATOR_CLASS_NAME`
[...]
+| [hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled](#hoodiedatasourcewritekeygeneratorconsistentlogicaltimestampenabled) | false | When set to true, a consistent value will be generated for a logical timestamp type column, like timestamp-millis and timestamp-micros, irrespective of whether the row-writer is enabled. Disabled by default so as not to break pipelines that deploy either the fully row-writer path or non row [...]
+| [hoodie.datasource.write.partitionpath.urlencode](#hoodiedatasourcewritepartitionpathurlencode) | false | Should we URL-encode the partition path value before creating the folder structure.<br />`Config Param: URL_ENCODE_PARTITIONING` [...]
+| [hoodie.datasource.write.payload.class](#hoodiedatasourcewritepayloadclass) | org.apache.hudi.common.model.DefaultHoodieRecordPayload | Payload class used. Override this if you would like to roll your own merge logic when upserting/inserting. This will render any value set for PRECOMBINE_FIELD_OPT_VAL ineffective<br />`Config Param: PAYLOAD_CLASS_NAME` [...]
+| [hoodie.datasource.write.payload.type](#hoodiedatasourcewritepayloadtype) | HOODIE_AVRO_DEFAULT | org.apache.hudi.common.model.RecordPayloadType: Payload to use for merging records. AWS_DMS_AVRO: Provides support for seamlessly applying changes captured via Amazon Database Migration Service onto S3. HOODIE_AVRO: A payload to wrap an existing Hoodie Avro Record. Useful to cr [...]
+| [hoodie.datasource.write.reconcile.schema](#hoodiedatasourcewritereconcileschema) | false | This config controls how the writer's schema will be selected based on the incoming batch's schema as well as the existing table's. When schema reconciliation is DISABLED, the incoming batch's schema will be picked as the writer schema (therefore updating the table's schema). When schema reconcili [...]
+| [hoodie.datasource.write.record.merger.impls](#hoodiedatasourcewriterecordmergerimpls) | org.apache.hudi.common.model.HoodieAvroRecordMerger | List of HoodieMerger implementations constituting Hudi's merging strategy, based on the engine used. These merger impls will be filtered by hoodie.datasource.write.record.merger.strategy; Hudi will pick the most efficient implementation to perform merging/combining of the records (during upd [...]
+| [hoodie.datasource.write.record.merger.strategy](#hoodiedatasourcewriterecordmergerstrategy) | eeb8d96f-b1e4-49fd-bbf8-28ac514178e5 | Id of the merger strategy. Hudi will pick the HoodieRecordMerger implementations in hoodie.datasource.write.record.merger.impls which have the same merger strategy id<br />`Config Param: RECORD_MERGER_STRATEGY`<br />`Since Version: 0.13.0` [...]
+| [hoodie.datasource.write.row.writer.enable](#hoodiedatasourcewriterowwriterenable) | true | When set to true, will perform write operations directly using the spark native `Row` representation, avoiding any additional conversion costs.<br />`Config Param: ENABLE_ROW_WRITER` [...]
+| [hoodie.datasource.write.streaming.checkpoint.identifier](#hoodiedatasourcewritestreamingcheckpointidentifier) | default_single_writer | A stream identifier used by Hudi to fetch the right checkpoint (the `batch id`, to be more specific) corresponding to this writer. Please keep the identifier unique for each writer in a multi-writer scenario. If the value is not set, will only keep the checkpoint [...]
+| [hoodie.datasource.write.streaming.disable.compaction](#hoodiedatasourcewritestreamingdisablecompaction) | false | By default for a MOR table, async compaction is enabled with the spark streaming sink. By setting this config to true, we can disable it; the expectation is that users will schedule and execute compaction in a different process/job altogether. Some users may wish to run it separately t [...]
+| [hoodie.datasource.write.streaming.ignore.failed.batch](#hoodiedatasourcewritestreamingignorefailedbatch) | false | Config to indicate whether to ignore any non-exception error (e.g. writestatus error) within a streaming microbatch. Turning this on could hide write status errors while the spark checkpoint moves ahead, so we recommend users use this with caution.<br />`Config Param: STRE [...]
+| [hoodie.datasource.write.streaming.retry.count](#hoodiedatasourcewritestreamingretrycount) | 3 | Config to indicate how many times the streaming job should retry a failed micro batch.<br />`Config Param: STREAMING_RETRY_CNT` [...]
+| [hoodie.datasource.write.streaming.retry.interval.ms](#hoodiedatasourcewritestreamingretryintervalms) | 2000 | Config to indicate how long (in milliseconds) to wait before a retry is issued for a failed microbatch<br />`Config Param: STREAMING_RETRY_INTERVAL_MS` [...]
+| [hoodie.meta.sync.client.tool.class](#hoodiemetasyncclienttoolclass) | org.apache.hudi.hive.HiveSyncTool | Sync tool class name used to sync to metastore. Defaults to Hive.<br />`Config Param: META_SYNC_CLIENT_TOOL_CLASS_NAME` [...]
+| [hoodie.spark.sql.insert.into.operation](#hoodiesparksqlinsertintooperation) | insert | SQL write operation to use with the INSERT_INTO Spark SQL command. This comes with 3 possible values: bulk_insert, insert and upsert. bulk_insert is generally meant for initial loads and is known to be performant compared to insert, but bulk_insert may not do small file management. If yo [...]
+| [hoodie.spark.sql.optimized.writes.enable](#hoodiesparksqloptimizedwritesenable) | true | Controls whether Spark SQL prepped update, delete, and merge are enabled.<br />`Config Param: SPARK_SQL_OPTIMIZED_WRITES`<br />`Since Version: 0.14.0` [...]
+| [hoodie.sql.bulk.insert.enable](#hoodiesqlbulkinsertenable) | false | When set to true, the SQL insert statement will use bulk insert. This config is deprecated as of 0.14.0; please use hoodie.spark.sql.insert.into.operation instead.<br />`Config Param: SQL_ENABLE_BULK_INSERT` [...]
+| [hoodie.sql.insert.mode](#hoodiesqlinsertmode) | upsert | Insert mode when inserting data into a pk-table. The optional modes are: upsert, strict and non-strict. For upsert mode, the insert statement does the upsert operation for the pk-table, which will update the duplicate record. For strict mode, the insert statement will keep the primary key uniqueness con [...]
+| [hoodie.streamer.source.kafka.value.deserializer.class](#hoodiestreamersourcekafkavaluedeserializerclass) | io.confluent.kafka.serializers.KafkaAvroDeserializer | This class is used by the Kafka client to deserialize the records<br />`Config Param: KAFKA_AVRO_VALUE_DESERIALIZER_CLASS`<br />`Since Version: 0.9.0` [...]
+| [hoodie.write.set.null.for.missing.columns](#hoodiewritesetnullformissingcolumns) | false | When a nullable column is missing from the incoming batch during a write operation, the write operation will fail the schema compatibility check. Setting this option to true will fill the missing column with null values so that the write operation can complete successfully.<br />`Config Param [...]
---
@@ -936,7 +938,8 @@ Configurations that control write behavior on Hudi tables. These can be directly
| [hoodie.consistency.check.max_checks](#hoodieconsistencycheckmax_checks) | 7 | Maximum number of checks for consistency of written data.<br />`Config Param: MAX_CONSISTENCY_CHECKS` [...]
| [hoodie.consistency.check.max_interval_ms](#hoodieconsistencycheckmax_interval_ms) | 300000 | Max time to wait between successive attempts at performing consistency checks<br />`Config Param: MAX_CONSISTENCY_CHECK_INTERVAL_MS` [...]
| [hoodie.datasource.write.keygenerator.type](#hoodiedatasourcewritekeygeneratortype) | SIMPLE | **Note** This is being actively worked on. Please use `hoodie.datasource.write.keygenerator.class` instead. org.apache.hudi.keygen.constant.KeyGeneratorType: Key generator type, indicating the key generator class to use, that implements `org.apache.hudi.keygen.KeyGenerator`. SIMPLE(default) [...]
-| [hoodie.datasource.write.payload.class](#hoodiedatasourcewritepayloadclass) | org.apache.hudi.common.model.OverwriteWithLatestAvroPayload | Payload class used. Override this if you would like to roll your own merge logic when upserting/inserting. This will render any value set for PRECOMBINE_FIELD_OPT_VAL ineffective<br />`Config Param: WRITE_PAYLOAD_CLASS_NAME` [...]
+| [hoodie.datasource.write.payload.class](#hoodiedatasourcewritepayloadclass) | org.apache.hudi.common.model.DefaultHoodieRecordPayload | Payload class used. Override this if you would like to roll your own merge logic when upserting/inserting. This will render any value set for PRECOMBINE_FIELD_OPT_VAL ineffective<br />`Config Param: WRITE_PAYLOAD_CLASS_NAME` [...]
+| [hoodie.datasource.write.payload.type](#hoodiedatasourcewritepayloadtype) | HOODIE_AVRO_DEFAULT | org.apache.hudi.common.model.RecordPayloadType: Payload to use for merging records. AWS_DMS_AVRO: Provides support for seamlessly applying changes captured via Amazon Database Migration Service onto S3. HOODIE_AVRO: A payload to wrap an existing Hoodie Avro Record. Useful to create a HoodieRe [...]
| [hoodie.datasource.write.record.merger.impls](#hoodiedatasourcewriterecordmergerimpls) | org.apache.hudi.common.model.HoodieAvroRecordMerger | List of HoodieMerger implementations constituting Hudi's merging strategy, based on the engine used. These merger impls will be filtered by hoodie.datasource.write.record.merger.strategy; Hudi will pick the most efficient implementation to perform merging/combining of the records (during update, readin [...]
| [hoodie.datasource.write.record.merger.strategy](#hoodiedatasourcewriterecordmergerstrategy) | eeb8d96f-b1e4-49fd-bbf8-28ac514178e5 | Id of the merger strategy. Hudi will pick the HoodieRecordMerger implementations in hoodie.datasource.write.record.merger.impls which have the same merger strategy id<br />`Config Param: RECORD_MERGER_STRATEGY`<br />`Since Version: 0.13.0` [...]
| [hoodie.datasource.write.schema.allow.auto.evolution.column.drop](#hoodiedatasourcewriteschemaallowautoevolutioncolumndrop) | false | Controls whether the table's schema is allowed to automatically evolve when the incoming batch's schema can have any of the columns dropped. By default, Hudi will not allow this kind of (auto) schema evolution. Set this config to true to allow the table's schema to be updated automatically when columns are [...]
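The rows above capture the doc change this commit makes: the documented default payload class moves from OverwriteWithLatestAvroPayload to DefaultHoodieRecordPayload (payload type HOODIE_AVRO_DEFAULT). As a hedged sketch, a job that depends on the previous latest-write-wins merge behavior could pin the payload class explicitly instead of relying on the default (again assuming a DataFrame `df`; table name and path are illustrative):

```scala
import org.apache.spark.sql.SaveMode

df.write.format("hudi").
  option("hoodie.table.name", "trips"). // hypothetical table name
  // Pin the pre-change payload class explicitly; without this, the
  // documented default (DefaultHoodieRecordPayload) applies.
  option("hoodie.datasource.write.payload.class",
    "org.apache.hudi.common.model.OverwriteWithLatestAvroPayload").
  mode(SaveMode.Append).
  save("/tmp/hudi/trips") // hypothetical base path
```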
@@ -1691,7 +1694,7 @@ Payload related configs, that can be leveraged to control merges based on specif
| Config Name | Default | Description |
| ---------------------------------------------------------------- | ------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [hoodie.compaction.payload.class](#hoodiecompactionpayloadclass) | org.apache.hudi.common.model.OverwriteWithLatestAvroPayload | This needs to be the same as the class used during insert/upserts. Just like writing, compaction also uses the record payload class to merge records in the log against each other, merge again with the base file and produce the final record to be written after compaction.<br />`Config Param: PAYLOAD_CLASS_NAME` |
+| [hoodie.compaction.payload.class](#hoodiecompactionpayloadclass) | org.apache.hudi.common.model.DefaultHoodieRecordPayload | This needs to be the same as the class used during insert/upserts. Just like writing, compaction also uses the record payload class to merge records in the log against each other, merge again with the base file and produce the final record to be written after compaction.<br />`Config Param: PAYLOAD_CLASS_NAME` |
| [hoodie.payload.event.time.field](#hoodiepayloadeventtimefield) | ts | Table column/field name to derive the timestamp associated with the records. This can be useful, e.g., for determining the freshness of the table.<br />`Config Param: EVENT_TIME_FIELD` |
| [hoodie.payload.ordering.field](#hoodiepayloadorderingfield) | ts | Table column/field name to order records that have the same key, before merging and writing to storage.<br />`Config Param: ORDERING_FIELD` |
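Since compaction reuses the record payload class from the write path, the compaction payload class and the ordering/event-time fields should line up with the write-side settings. A minimal sketch under those assumptions, with `ts` as an illustrative ordering column and a hypothetical table name and path:

```scala
import org.apache.spark.sql.SaveMode

df.write.format("hudi").
  option("hoodie.table.name", "trips").
  // Order same-key records by the illustrative `ts` column before merging.
  option("hoodie.payload.ordering.field", "ts").
  // Derive record freshness from the same column.
  option("hoodie.payload.event.time.field", "ts").
  // Must match the payload class used during insert/upserts.
  option("hoodie.compaction.payload.class",
    "org.apache.hudi.common.model.DefaultHoodieRecordPayload").
  mode(SaveMode.Append).
  save("/tmp/hudi/trips")
```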
---