This is an automated email from the ASF dual-hosted git repository.
sivabalan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 82db2d55e6 [HUDI-3928][HUDI-3932] Adding docs for 0.11 release (savepoint restore to CLI, pulsar commit callback, hive schema provider) (#5429)
82db2d55e6 is described below
commit 82db2d55e6069ced65c0074776b510deed4030ac
Author: Sivabalan Narayanan <[email protected]>
AuthorDate: Thu Apr 28 21:17:40 2022 -0400
[HUDI-3928][HUDI-3932] Adding docs for 0.11 release (savepoint restore to
CLI, pulsar commit callback, hive schema provider) (#5429)
---
website/docs/cli.md | 29 +++++++++++++++++++++++++++++
website/docs/disaster_recovery.md | 2 +-
website/docs/faq.md | 6 ++++++
website/docs/hoodie_deltastreamer.md | 12 ++++++++++++
website/docs/writing_data.md | 18 ++++++++++++++++++
website/learn/faq.md | 6 ++++++
6 files changed, 72 insertions(+), 1 deletion(-)
diff --git a/website/docs/cli.md b/website/docs/cli.md
index 2b9bb60ba8..c971181545 100644
--- a/website/docs/cli.md
+++ b/website/docs/cli.md
@@ -391,3 +391,32 @@ hudi:stock_ticks_mor->compaction repair --instant 20181005222611
Compaction successfully repaired
.....
```
+
+## Savepoint and Restore
+As the name suggests, a "savepoint" saves the table as of the commit time, so that you can restore the table to this
+savepoint at a later point in time if need be. You can read more about savepoints and restore [here](/docs/next/disaster_recovery).
+
+To trigger a savepoint for a Hudi table:
+```java
+connect --path /tmp/hudi_trips_cow/
+commits show
+set --conf SPARK_HOME=<SPARK_HOME>
+savepoint create --commit 20220128160245447 --sparkMaster local[2]
+```
+
+To restore the table to one of the savepointed commits:
+
+```java
+connect --path /tmp/hudi_trips_cow/
+commits show
+set --conf SPARK_HOME=<SPARK_HOME>
+savepoints show
+╔═══════════════════╗
+║ SavepointTime ║
+╠═══════════════════╣
+║ 20220128160245447 ║
+╚═══════════════════╝
+savepoint rollback --savepoint 20220128160245447 --sparkMaster local[2]
+```
+
+
diff --git a/website/docs/disaster_recovery.md
b/website/docs/disaster_recovery.md
index aee1388436..c2f53bc8cd 100644
--- a/website/docs/disaster_recovery.md
+++ b/website/docs/disaster_recovery.md
@@ -1,5 +1,5 @@
---
-title: Disaster Recovery with Apache Hudi
+title: Disaster Recovery
toc: true
---
diff --git a/website/docs/faq.md b/website/docs/faq.md
index cee9e583e5..be340e3b38 100644
--- a/website/docs/faq.md
+++ b/website/docs/faq.md
@@ -491,6 +491,12 @@ But manually changing it will result in checksum errors. So, we have to go via h
1. connect --path hudi_table_path
2. repair overwrite-hoodie-props --new-props-file new_hoodie.properties
+### Can I get notified when new commits happen in my Hudi table?
+
+Yes. Hudi provides the ability to post a callback notification about a write commit. You can use an HTTP hook, get
+notified via a Kafka or Pulsar topic, or plug in your own implementation. Please refer
+[here](https://hudi.apache.org/docs/next/writing_data/#commit-notifications) for details.
+
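+A minimal sketch of enabling the HTTP callback via write configs (the property names and endpoint below are
+assumptions based on Hudi's commit-callback configuration; verify them against your Hudi version):
+
+```properties
+# Enable commit callbacks for this writer (assumed default callback is the HTTP implementation)
+hoodie.write.commit.callback.on=true
+# HTTP endpoint to notify on each successful commit (illustrative URL)
+hoodie.write.commit.callback.http.url=https://example.com/hudi/commit-hook
+# Request timeout, in seconds
+hoodie.write.commit.callback.http.timeout.seconds=3
+```
+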
## Contributing to FAQ
A good and usable FAQ should be community-driven and crowd source
questions/thoughts across everyone.
diff --git a/website/docs/hoodie_deltastreamer.md
b/website/docs/hoodie_deltastreamer.md
index 3c49bd2bbf..ae87c579cd 100644
--- a/website/docs/hoodie_deltastreamer.md
+++ b/website/docs/hoodie_deltastreamer.md
@@ -265,6 +265,18 @@ You can use a .avsc file to define your schema. You can then point to this file
|hoodie.deltastreamer.schemaprovider.source.schema.file|The schema of the source you are reading from|[example schema file](https://github.com/apache/hudi/blob/a8fb69656f522648233f0310ca3756188d954281/docker/demo/config/test-suite/source.avsc)|
|hoodie.deltastreamer.schemaprovider.target.schema.file|The schema of the target you are writing to|[example schema file](https://github.com/apache/hudi/blob/a8fb69656f522648233f0310ca3756188d954281/docker/demo/config/test-suite/target.avsc)|
+
+### Hive Schema Provider
+You can use Hive tables to fetch the source and target schemas.
+
+|Config| Description |
+|---|---|
+|hoodie.deltastreamer.schemaprovider.source.schema.hive.database| Hive database from which the source schema can be fetched |
+|hoodie.deltastreamer.schemaprovider.source.schema.hive.table| Hive table from which the source schema can be fetched |
+|hoodie.deltastreamer.schemaprovider.target.schema.hive.database| Hive database from which the target schema can be fetched |
+|hoodie.deltastreamer.schemaprovider.target.schema.hive.table| Hive table from which the target schema can be fetched |
+
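+As a sketch, these configs might look like the following when the Hive schema provider is selected (e.g. via
+DeltaStreamer's `--schemaprovider-class` option); the database and table names below are illustrative:
+
+```properties
+# Where to read the source schema from (illustrative names)
+hoodie.deltastreamer.schemaprovider.source.schema.hive.database=src_db
+hoodie.deltastreamer.schemaprovider.source.schema.hive.table=src_table
+# Where to read the target schema from (illustrative names)
+hoodie.deltastreamer.schemaprovider.target.schema.hive.database=tgt_db
+hoodie.deltastreamer.schemaprovider.target.schema.hive.table=tgt_table
+```
+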
+
### Schema Provider with Post Processor
The SchemaProviderWithPostProcessor will extract the schema from one of the previously mentioned Schema Providers and
then apply a post processor to change the schema before it is used. You can write your own post processor by extending
diff --git a/website/docs/writing_data.md b/website/docs/writing_data.md
index 8765222b21..3c0a516e2c 100644
--- a/website/docs/writing_data.md
+++ b/website/docs/writing_data.md
@@ -446,6 +446,24 @@ You can push a commit notification to a Kafka topic so it can be used by other r
| BOOTSTRAP_SERVERS | Bootstrap servers of kafka cluster, to be used for publishing commit metadata | required | N/A |
| | | | |
+#### Pulsar Endpoints
+You can push a commit notification to a Pulsar topic so it can be used by other real-time systems.
+
+| Config | Description | Required | Default |
+| ----------- | ----------- | ------- | ------- |
+| hoodie.write.commit.callback.pulsar.broker.service.url | Broker service URL of the Pulsar cluster used to publish commit metadata. | required | N/A |
+| hoodie.write.commit.callback.pulsar.topic | Pulsar topic name to publish timeline activity into. | required | N/A |
+| hoodie.write.commit.callback.pulsar.producer.route-mode | Message routing logic for producers on partitioned topics. | optional | RoundRobinPartition |
+| hoodie.write.commit.callback.pulsar.producer.pending-queue-size | The maximum size of a queue holding pending messages. | optional | 1000 |
+| hoodie.write.commit.callback.pulsar.producer.pending-total-size | The maximum number of pending messages across partitions. | optional | 50000 |
+| hoodie.write.commit.callback.pulsar.producer.block-if-queue-full | When the queue is full, block the send call instead of throwing an exception. | optional | true |
+| hoodie.write.commit.callback.pulsar.producer.send-timeout | Timeout for each message sent to Pulsar. | optional | 30s |
+| hoodie.write.commit.callback.pulsar.operation-timeout | Duration to wait for an operation to complete. | optional | 30s |
+| hoodie.write.commit.callback.pulsar.connection-timeout | Duration to wait for a connection to a broker to be established. | optional | 10s |
+| hoodie.write.commit.callback.pulsar.request-timeout | Duration to wait for a request to complete. | optional | 60s |
+| hoodie.write.commit.callback.pulsar.keepalive-interval | Keep-alive interval for each client-broker connection. | optional | 30s |
+
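+A minimal sketch of wiring up the Pulsar callback in write configs (the callback class name and endpoint values are
+assumptions; verify them against your Hudi version):
+
+```properties
+# Enable commit callbacks and select the Pulsar implementation (class name assumed)
+hoodie.write.commit.callback.on=true
+hoodie.write.commit.callback.class=org.apache.hudi.utilities.callback.pulsar.HoodieWriteCommitPulsarCallback
+# Required Pulsar settings (illustrative values)
+hoodie.write.commit.callback.pulsar.broker.service.url=pulsar://localhost:6650
+hoodie.write.commit.callback.pulsar.topic=hudi-commit-events
+```
+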
#### Bring your own implementation
You can extend the HoodieWriteCommitCallback class to implement your own way
to asynchronously handle the callback
of a successful write. Use this public API:
diff --git a/website/learn/faq.md b/website/learn/faq.md
index ab6501a88b..3ab46888d5 100644
--- a/website/learn/faq.md
+++ b/website/learn/faq.md
@@ -513,6 +513,12 @@ But manually changing it will result in checksum errors. So, we have to go via h
1. connect --path hudi_table_path
2. repair overwrite-hoodie-props --new-props-file new_hoodie.properties
+### Can I get notified when new commits happen in my Hudi table?
+
+Yes. Hudi provides the ability to post a callback notification about a write commit. You can use an HTTP hook, get
+notified via a Kafka or Pulsar topic, or plug in your own implementation. Please refer
+[here](https://hudi.apache.org/docs/next/writing_data/#commit-notifications) for details.
+
## Contributing to FAQ
A good and usable FAQ should be community-driven and crowd source
questions/thoughts across everyone.