This is an automated email from the ASF dual-hosted git repository.
xushiyan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new aeca06472c9a chore(site): update FAQ ref links (#14123)
aeca06472c9a is described below
commit aeca06472c9a88c9f7914d174ef029d165a539df
Author: Shiyan Xu <[email protected]>
AuthorDate: Tue Oct 21 20:11:08 2025 -0500
chore(site): update FAQ ref links (#14123)
---
website/docs/quick-start-guide.md | 2 +-
website/docs/record_merger.md | 2 +-
website/docs/writing_data.md | 2 +-
website/src/pages/faq/design_and_concepts.md | 2 +-
website/src/pages/faq/storage.md | 2 +-
website/src/pages/faq/writing_tables.md | 4 ++--
website/versioned_docs/version-0.14.1/faq.md | 14 +++++++-------
website/versioned_docs/version-0.15.0/faq.md | 12 ++++++------
8 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/website/docs/quick-start-guide.md b/website/docs/quick-start-guide.md
index 579b2750907a..388482281f6a 100644
--- a/website/docs/quick-start-guide.md
+++ b/website/docs/quick-start-guide.md
@@ -1305,7 +1305,7 @@ transformation support, automatic table services and so on.
**Structured Streaming** - Hudi supports Spark Structured Streaming reads and
writes as well. Please see
[here](writing_tables_streaming_writes#spark-streaming) for more.
-Check out more information on [modeling data in Hudi](faq_general#how-do-i-model-the-data-stored-in-hudi) and different ways to perform [batch writes](/docs/writing_data) and [streaming writes](writing_tables_streaming_writes).
+Check out more information on [modeling data in Hudi](faq/general#how-do-i-model-the-data-stored-in-hudi) and different ways to perform [batch writes](/docs/writing_data) and [streaming writes](writing_tables_streaming_writes).
### Dockerized Demo
Even as we showcased the core capabilities, Hudi supports a lot more advanced
functionality that can make it easy
diff --git a/website/docs/record_merger.md b/website/docs/record_merger.md
index 5dfc70f08e78..4c47dbde1b6a 100644
--- a/website/docs/record_merger.md
+++ b/website/docs/record_merger.md
@@ -249,7 +249,7 @@ Payload class can be specified using the below configs. For more advanced config
There are also quite a few other implementations. Developers may be interested
in looking at the hierarchy of `HoodieRecordPayload` interface. For
example,
[`MySqlDebeziumAvroPayload`](https://github.com/apache/hudi/blob/e76dd102bcaf8aec5a932e7277ccdbfd73ce1a32/hudi-common/src/main/java/org/apache/hudi/common/model/debezium/MySqlDebeziumAvroPayload.java)
and
[`PostgresDebeziumAvroPayload`](https://github.com/apache/hudi/blob/e76dd102bcaf8aec5a932e7277ccdbfd73ce1a32/hudi-common/src/main/java/org/apache/hudi/common/model/debezium/PostgresDebeziumAvroPayload.java)
provides support for seamlessly applying changes
captured via Debezium for MySQL and PostgresDB.
[`AWSDmsAvroPayload`](https://github.com/apache/hudi/blob/e76dd102bcaf8aec5a932e7277ccdbfd73ce1a32/hudi-common/src/main/java/org/apache/hudi/common/model/AWSDmsAvroPayload.java)
provides support for applying changes captured via Amazon Database Migration
Service onto S3.
-For full configurations, go [here](/docs/configurations#RECORD_PAYLOAD) and please check out [this FAQ](faq_writing_tables/#can-i-implement-my-own-logic-for-how-input-records-are-merged-with-record-on-storage) if you want to implement your own custom payloads.
+For full configurations, go [here](/docs/configurations#RECORD_PAYLOAD) and please check out [this FAQ](faq/writing_tables/#can-i-implement-my-own-logic-for-how-input-records-are-merged-with-record-on-storage) if you want to implement your own custom payloads.
## Related Resources
diff --git a/website/docs/writing_data.md b/website/docs/writing_data.md
index 81462307a7f5..6d3272378e55 100644
--- a/website/docs/writing_data.md
+++ b/website/docs/writing_data.md
@@ -83,7 +83,7 @@ df.write.format("hudi").
You can check the data generated under
`/tmp/hudi_trips_cow/<region>/<country>/<city>/`. We provided a record key
(`uuid` in
[schema](https://github.com/apache/hudi/blob/6f9b02decb5bb2b83709b1b6ec04a97e4d102c11/hudi-spark-datasource/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java#L60)),
partition field (`region/country/city`) and combine logic (`ts` in
[schema](https://github.com/apache/hudi/blob/6f9b02decb5bb2b83709b1b6ec04a97e4d102c11/hudi-spark-datasource/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java#L60))
to ensure trip records are unique within each partition. For more info, refer
to
-[Modeling data stored in Hudi](faq_general/#how-do-i-model-the-data-stored-in-hudi)
+[Modeling data stored in Hudi](faq/general/#how-do-i-model-the-data-stored-in-hudi)
and for info on ways to ingest data into Hudi, refer to [Writing Hudi
Tables](/docs/hoodie_streaming_ingestion).
Here we are using the default write operation : `upsert`. If you have a
workload without updates, you can also issue
`insert` or `bulk_insert` operations which could be faster. To know more,
refer to [Write operations](/docs/write_operations)
diff --git a/website/src/pages/faq/design_and_concepts.md b/website/src/pages/faq/design_and_concepts.md
index e2b2e619f613..6da73cefb728 100644
--- a/website/src/pages/faq/design_and_concepts.md
+++ b/website/src/pages/faq/design_and_concepts.md
@@ -49,7 +49,7 @@ To expand more on the long term approach, Hudi has had a proposal to streamline/
This has been delayed for a few reasons
- Large hosted query engines and users not upgrading fast enough.
-- The issues brought up - \[[1](faq_design_and_concepts#does-hudis-use-of-wall-clock-timestamp-for-instants-pose-any-clock-skew-issues),[2](faq_design_and_concepts#hudis-commits-are-based-on-transaction-start-time-instead-of-completed-time-does-this-cause-data-loss-or-inconsistency-in-case-of-incremental-and-time-travel-queries)\],
+- The issues brought up - \[[1](faq/design_and_concepts#does-hudis-use-of-wall-clock-timestamp-for-instants-pose-any-clock-skew-issues),[2](faq/design_and_concepts#hudis-commits-are-based-on-transaction-start-time-instead-of-completed-time-does-this-cause-data-loss-or-inconsistency-in-case-of-incremental-and-time-travel-queries)\],
relevant to this are not practically very important to users beyond good
pedantic discussions,
- Wanting to do it alongside [non-blocking concurrency
control](https://github.com/apache/hudi/pull/7907) in Hudi version 1.x.
diff --git a/website/src/pages/faq/storage.md b/website/src/pages/faq/storage.md
index c6dcb6ad8c96..77119450539e 100644
--- a/website/src/pages/faq/storage.md
+++ b/website/src/pages/faq/storage.md
@@ -19,7 +19,7 @@ More details can be found [here](/docs/concepts/) and also [Design And Architect
### How do I migrate my data to Hudi?
-Hudi provides built in support for rewriting your entire table into Hudi one-time using the HDFSParquetImporter tool available from the hudi-cli . You could also do this via a simple read and write of the dataset using the Spark datasource APIs. Once migrated, writes can be performed using normal means discussed [here](faq_writing_tables#what-are-some-ways-to-write-a-hudi-table). This topic is discussed in detail [here](/docs/migration_guide/), including ways to doing partial migrations.
+Hudi provides built in support for rewriting your entire table into Hudi one-time using the HDFSParquetImporter tool available from the hudi-cli . You could also do this via a simple read and write of the dataset using the Spark datasource APIs. Once migrated, writes can be performed using normal means discussed [here](faq/writing_tables#what-are-some-ways-to-write-a-hudi-table). This topic is discussed in detail [here](/docs/migration_guide/), including ways to doing partial migrations.
### How to convert an existing COW table to MOR?
diff --git a/website/src/pages/faq/writing_tables.md b/website/src/pages/faq/writing_tables.md
index 1c8ae20623cc..c2c30abeb807 100644
--- a/website/src/pages/faq/writing_tables.md
+++ b/website/src/pages/faq/writing_tables.md
@@ -73,11 +73,11 @@ GDPR has made deletes a must-have tool in everyone's data management toolbox. Hu
### Should I need to worry about deleting all copies of the records in case of duplicates?
-No. Hudi removes all the copies of a record key when deletes are issued. Here is the long form explanation - Sometimes accidental user errors can lead to duplicates introduced into a Hudi table by either [concurrent inserts](faq_writing_tables#can-concurrent-inserts-cause-duplicates) or by [not deduping the input records](faq_writing_tables#can-single-writer-inserts-have-duplicates) for an insert operation. However, using the right index (e.g., in the default [Simple Index](https://githu [...]
+No. Hudi removes all the copies of a record key when deletes are issued. Here is the long form explanation - Sometimes accidental user errors can lead to duplicates introduced into a Hudi table by either [concurrent inserts](faq/writing_tables#can-concurrent-inserts-cause-duplicates) or by [not deduping the input records](faq/writing_tables#can-single-writer-inserts-have-duplicates) for an insert operation. However, using the right index (e.g., in the default [Simple Index](https://githu [...]
### How does Hudi handle duplicate record keys in an input?
-When issuing an `upsert` operation on a table and the batch of records provided contains multiple entries for a given key, then all of them are reduced into a single final value by repeatedly calling payload class's [preCombine()](https://github.com/apache/hudi/blob/d3edac4612bde2fa9deca9536801dbc48961fb95/hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordPayload.java#L40) method . By default, we pick the record with the greatest value (determined by calling .compareTo() [...]
+When issuing an `upsert` operation on a table and the batch of records provided contains multiple entries for a given key, then all of them are reduced into a single final value by repeatedly calling payload class's [preCombine()](https://github.com/apache/hudi/blob/d3edac4612bde2fa9deca9536801dbc48961fb95/hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordPayload.java#L40) method . By default, we pick the record with the greatest value (determined by calling .compareTo() [...]
For an insert or bulk_insert operation, no such pre-combining is performed.
Thus, if your input contains duplicates, the table would also contain
duplicates. If you don't want duplicate records either issue an **upsert** or
consider specifying option to de-duplicate input in either datasource using
[`hoodie.datasource.write.insert.drop.duplicates`](/docs/configurations#hoodiedatasourcewriteinsertdropduplicates)
& [`hoodie.combine.before.insert`](/docs/configurations/#hoodiecombinebeforei
[...]
diff --git a/website/versioned_docs/version-0.14.1/faq.md b/website/versioned_docs/version-0.14.1/faq.md
index b3629b4377ac..732108eeb6a9 100644
--- a/website/versioned_docs/version-0.14.1/faq.md
+++ b/website/versioned_docs/version-0.14.1/faq.md
@@ -6,10 +6,10 @@ keywords: [hudi, writing, reading]
The FAQs are split into following pages. Please refer to the specific pages
for more info.
-- [General](/docs/next/faq_general)
-- [Design & Concepts](/docs/next/faq_design_and_concepts)
-- [Writing Tables](/docs/next/faq_writing_tables)
-- [Querying Tables](/docs/next/faq_reading_tables)
-- [Table Services](/docs/next/faq_table_services)
-- [Storage](/docs/next/faq_storage)
-- [Integrations](/docs/next/faq_integrations)
+- [General](/docs/faq_general)
+- [Design & Concepts](/docs/faq_design_and_concepts)
+- [Writing Tables](/docs/faq_writing_tables)
+- [Querying Tables](/docs/faq_reading_tables)
+- [Table Services](/docs/faq_table_services)
+- [Storage](/docs/faq_storage)
+- [Integrations](/docs/faq_integrations)
diff --git a/website/versioned_docs/version-0.15.0/faq.md b/website/versioned_docs/version-0.15.0/faq.md
index b12840189309..0a757470a4c4 100644
--- a/website/versioned_docs/version-0.15.0/faq.md
+++ b/website/versioned_docs/version-0.15.0/faq.md
@@ -7,9 +7,9 @@ keywords: [hudi, writing, reading]
The FAQs are split into following pages. Please refer to the specific pages
for more info.
- [General](faq_general)
-- [Design & Concepts](/docs/next/faq_design_and_concepts)
-- [Writing Tables](/docs/next/faq_writing_tables)
-- [Reading Tables](/docs/next/faq_reading_tables)
-- [Table Services](/docs/next/faq_table_services)
-- [Storage](/docs/next/faq_storage)
-- [Integrations](/docs/next/faq_integrations)
+- [Design & Concepts](/docs/faq_design_and_concepts)
+- [Writing Tables](/docs/faq_writing_tables)
+- [Reading Tables](/docs/faq_reading_tables)
+- [Table Services](/docs/faq_table_services)
+- [Storage](/docs/faq_storage)
+- [Integrations](/docs/faq_integrations)
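
The hunks above apply two link substitutions: in the current docs and FAQ pages, relative targets of the form `faq_<page>` become `faq/<page>`, and in the versioned `faq.md` files, `/docs/next/faq_<page>` becomes `/docs/faq_<page>`. The following is a minimal sketch of those two rewrites, not the tooling actually used for this commit (the commit does not say how the edits were made). The function names and regular expression are hypothetical and chosen only for illustration.

```python
import re

# Assumption: FAQ page names consist of lowercase letters and underscores,
# as in the targets visible in the diff (general, writing_tables, ...).
_RELATIVE_FAQ_LINK = re.compile(r"\]\(faq_([a-z_]+)")


def rewrite_current_doc_links(text: str) -> str:
    """Turn relative Markdown targets like (faq_general#x) into (faq/general#x).

    Only targets that start directly with 'faq_' are touched, so absolute
    targets such as (/docs/faq_general) are left alone.
    """
    return _RELATIVE_FAQ_LINK.sub(r"](faq/\1", text)


def rewrite_versioned_doc_links(text: str) -> str:
    """Turn (/docs/next/faq_x) targets into (/docs/faq_x), as in the versioned faq.md hunks."""
    return text.replace("](/docs/next/faq_", "](/docs/faq_")
```

Applied to the removed line of the `writing_data.md` hunk, `rewrite_current_doc_links` would yield the added line. Note that the versioned pages keep the `faq_<page>` form, so only the second function matches what this commit does there.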