geserdugarov commented on code in PR #10856:
URL: https://github.com/apache/hudi/pull/10856#discussion_r1522896072
##########
website/docs/basic_configurations.md:
##########
@@ -101,15 +102,14 @@ Flink jobs using the SQL can be configured through the options in WITH clause. T
| [hoodie.database.name](#hoodiedatabasename) | (N/A) | Database name to register to Hive metastore<br /> `Config Param: DATABASE_NAME` |
| [hoodie.table.name](#hoodietablename) | (N/A) | Table name to register to Hive metastore<br /> `Config Param: TABLE_NAME` |
| [path](#path) | (N/A) | Base path for the target hoodie table. The path would be created if it does not exist, otherwise a Hoodie table expects to be initialized successfully<br /> `Config Param: PATH` |
+| [read.commits.limit](#readcommitslimit) | (N/A) | The maximum number of commits allowed to read in each instant check, if it is streaming read, the avg read instants number per-second would be 'read.commits.limit'/'read.streaming.check-interval', by default no limit<br /> `Config Param: READ_COMMITS_LIMIT` |
| [read.end-commit](#readend-commit) | (N/A) | End commit instant for reading, the commit time format should be 'yyyyMMddHHmmss'<br /> `Config Param: READ_END_COMMIT` |
| [read.start-commit](#readstart-commit) | (N/A) | Start commit instant for reading, the commit time format should be 'yyyyMMddHHmmss', by default reading from the latest instant for streaming read<br /> `Config Param: READ_START_COMMIT` |
| [archive.max_commits](#archivemax_commits) | 50 | Max number of commits to keep before archiving older commits into a sequential log, default 50<br /> `Config Param: ARCHIVE_MAX_COMMITS` |
| [archive.min_commits](#archivemin_commits) | 40 | Min number of commits to keep before archiving older commits into a sequential log, default 40<br /> `Config Param: ARCHIVE_MIN_COMMITS` |
| [cdc.enabled](#cdcenabled) | false | When enable, persist the change data if necessary, and can be queried as a CDC query mode<br /> `Config Param: CDC_ENABLED` |
| [cdc.supplemental.logging.mode](#cdcsupplementalloggingmode) | DATA_BEFORE_AFTER | Setting 'op_key_only' persists the 'op' and the record key only, setting 'data_before' persists the additional 'before' image, and setting 'data_before_after' persists the additional 'before' and 'after' images.<br /> `Config Param: SUPPLEMENTAL_LOGGING_MODE` |
| [changelog.enabled](#changelogenabled) | false | Whether to keep all the intermediate changes, we try to keep all the changes of a record when enabled: 1). The sink accept the UPDATE_BEFORE message; 2). The source try to emit every changes of a record. The semantics is best effort because the compaction job would finally merge all changes of a record into one. default false to have UPSERT semantics<br /> `Config Param: CHANGELOG_ENABLED` |
-| [clean.async.enabled](#cleanasyncenabled) | true | Whether to cleanup the old commits immediately on new commits, enabled by default<br /> `Config Param: CLEAN_ASYNC_ENABLED` |
-| [clean.retain_commits](#cleanretain_commits) | 30 | Number of commits to retain. So data will be retained for num_of_commits * time_between_commits (scheduled). This also directly translates into how much you can incrementally pull on this table, default 30<br /> `Config Param: CLEAN_RETAIN_COMMITS` |
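For context, the options documented above are passed in the WITH clause of a Flink SQL `CREATE TABLE`. A minimal sketch, assuming the Hudi Flink connector is on the classpath; the table name, schema, and path are illustrative:

```sql
-- Hypothetical table definition showing a few of the documented options.
CREATE TABLE hudi_orders (
  uuid   VARCHAR(20) PRIMARY KEY NOT ENFORCED,
  amount DOUBLE,
  ts     TIMESTAMP(3)
) WITH (
  'connector' = 'hudi',
  'path' = 'file:///tmp/hudi_orders',            -- base path; created if absent
  'hoodie.table.name' = 'hudi_orders',
  'read.start-commit' = '20240101000000',        -- 'yyyyMMddHHmmss' format
  'cdc.enabled' = 'true'                         -- persist change data for CDC queries
);
```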
Review Comment:
Yes, both files `website/docs/basic_configurations.md` and
`website/docs/configurations.md` were generated using
[`hudi-utils/generate_config.sh`](https://github.com/apache/hudi/blob/asf-site/hudi-utils/generate_config.sh)
in the `asf-site` branch.
The process is described in
[`hudi-utils/README.md`](https://github.com/apache/hudi/blob/asf-site/hudi-utils/README.md).
I used version `1.0.0-SNAPSHOT` to generate the files.
Specifically, the two removed lines correspond to the open [PR
10851](https://github.com/apache/hudi/pull/10851), and should be merged only
after that code-change PR is merged.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]