This is an automated email from the ASF dual-hosted git repository.

urfree pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/pulsar-site.git


The following commit(s) were added to refs/heads/main by this push:
     new 985536614ca Docs sync done from apache/pulsar(#9529850)
985536614ca is described below

commit 985536614cab66c1a4c44604ef3609107bfb5067
Author: Pulsar Site Updater <[email protected]>
AuthorDate: Thu Sep 1 12:00:53 2022 +0000

    Docs sync done from apache/pulsar(#9529850)
---
 site2/website-next/docs/cookbooks-deduplication.md | 25 +++++++++++++---------
 site2/website-next/docs/io-elasticsearch-sink.md   |  1 +
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/site2/website-next/docs/cookbooks-deduplication.md 
b/site2/website-next/docs/cookbooks-deduplication.md
index 702679641d7..de607c9ee14 100644
--- a/site2/website-next/docs/cookbooks-deduplication.md
+++ b/site2/website-next/docs/cookbooks-deduplication.md
@@ -4,6 +4,7 @@ title: Message deduplication
 sidebar_label: "Message deduplication "
 ---
 
+
 ````mdx-code-block
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
@@ -12,11 +13,13 @@ import TabItem from '@theme/TabItem';
 
 When **Message deduplication** is enabled, it ensures that each message 
produced on Pulsar topics is persisted to disk *only once*, even if the message 
is produced more than once. Message deduplication is handled automatically on 
the server side. 
 
-To use message deduplication in Pulsar, you need to configure your Pulsar 
brokers and clients.
+Message deduplication could affect the performance of the brokers during 
informational snapshots.
+
+To use message deduplication in Pulsar, you need to configure your Pulsar 
brokers, namespaces, or topics. It is recommended to modify the configuration 
in the clients, for example, setting send timeout to infinity.
 
 ## How it works
 
-You can enable or disable message deduplication at the namespace level or the 
topic level. By default, it is disabled on all namespaces or topics. You can 
enable it in the following ways:
+You can enable or disable message deduplication at the broker, namespace, or 
topic level. By default, it is disabled on all brokers, namespaces, and topics. 
You can enable it in the following ways:
 
 * Enable deduplication for all namespaces/topics at the broker-level.
 * Enable deduplication for a specific namespace with the `pulsar-admin 
namespaces` interface.
@@ -40,7 +43,7 @@ By default, message deduplication is *disabled* on all Pulsar 
namespaces/topics.
 
 Even if you set the value for `brokerDeduplicationEnabled`, enabling or 
disabling via Pulsar admin CLI overrides the default settings at the 
broker-level.
 
-### Enable message deduplication
+### Enable message deduplication at namespace or topic level
 
 Though message deduplication is disabled by default at the broker level, you 
can enable message deduplication for a specific namespace or topic using the 
[`pulsar-admin namespaces set-deduplication`](/tools/pulsar-admin/) or the 
[`pulsar-admin topics set-deduplication`](/tools/pulsar-admin/) command. You 
can use the `--enable`/`-e` flag and specify the namespace/topic. 
 
@@ -54,7 +57,7 @@ $ bin/pulsar-admin namespaces set-deduplication \
 
 ```
 
-### Disable message deduplication
+### Disable message deduplication at namespace or topic level
 
 Even if you enable message deduplication at the broker level, you can disable 
message deduplication for a specific namespace or topic using the 
[`pulsar-admin namespaces set-deduplication`](/tools/pulsar-admin/) or the 
[`pulsar-admin topics set-deduplication`](/tools/pulsar-admin/) command. Use 
the `--disable`/`-d` flag and specify the namespace/topic.
 
@@ -70,7 +73,9 @@ $ bin/pulsar-admin namespaces set-deduplication \
 
 ## Pulsar clients
 
-If you enable message deduplication in Pulsar brokers, you need complete the 
following tasks for your client producers:
+If you enable message deduplication in Pulsar brokers, namespaces, or topics, 
it is recommended to configure the client to retry sending each message 
indefinitely until it succeeds. Otherwise, the ordering guarantee can break, 
because some requests may time out and the application cannot know whether the 
message was successfully added to the topic. 
+
+To do so, complete the following tasks for your client producers:
 
 1. Specify a name for the producer.
 1. Set the message timeout to `0` (namely, no timeout).
@@ -83,7 +88,7 @@ The instructions for Java, Python, and C++ clients are 
different.
   values={[{"label":"Java clients","value":"Java clients"},{"label":"Python 
clients","value":"Python clients"},{"label":"C++ clients","value":"C++ 
clients"}]}>
 <TabItem value="Java clients">
 
-To enable message deduplication on a [Java 
producer](client-libraries-java#producer), set the producer name using the 
`producerName` setter, and set the timeout to `0` using the `sendTimeout` 
setter. 
+To preserve the ordering guarantee on a [Java 
producer](client-libraries-java.md#producers) sending to a topic with message 
deduplication enabled, set the producer name using the `producerName` setter, 
and set the timeout to `0` using the `sendTimeout` setter. 
 
 ```java
 
@@ -105,7 +110,7 @@ Producer producer = pulsarClient.newProducer()
 </TabItem>
 <TabItem value="Python clients">
 
-To enable message deduplication on a [Python 
producer](client-libraries-python#producer), set the producer name using 
`producer_name`, and set the timeout to `0` using `send_timeout_millis`. 
+To preserve the ordering guarantee on a [Python 
producer](client-libraries-python.md#producers) sending to a topic with message 
deduplication enabled, set the producer name using `producer_name`, and set the 
timeout to `0` using `send_timeout_millis`. 
 
 ```python
 
@@ -121,8 +126,7 @@ producer = client.create_producer(
 
 </TabItem>
 <TabItem value="C++ clients">
-
-To enable message deduplication on a [C++ 
producer](client-libraries-cpp/#create-a-producer), set the producer name using 
`producer_name`, and set the timeout to `0` using `send_timeout_millis`. 
+To preserve the ordering guarantee on a [C++ 
producer](client-libraries-cpp.md#producer) sending to a topic with message 
deduplication enabled, set the producer name using `setProducerName`, and set 
the timeout to `0` using `setSendTimeout`. 
 
 ```cpp
 
@@ -147,4 +151,5 @@ Result result = client.createProducer(topic, 
producerConfig, producer);
 </TabItem>
 
 </Tabs>
-````
\ No newline at end of file
+````
+
diff --git a/site2/website-next/docs/io-elasticsearch-sink.md 
b/site2/website-next/docs/io-elasticsearch-sink.md
index 88f7fabb9b5..04e195a776b 100644
--- a/site2/website-next/docs/io-elasticsearch-sink.md
+++ b/site2/website-next/docs/io-elasticsearch-sink.md
@@ -89,6 +89,7 @@ The configuration of the Elasticsearch sink connector has the 
following properti
 | `canonicalKeyFields` | Boolean | false | false | Whether to sort the key 
fields for JSON and Avro or not. If it is set to `true` and the record key 
schema is `JSON` or `AVRO`, the serialized object does not consider the order 
of properties. |
 | `stripNonPrintableCharacters` | Boolean| false | true| Whether to remove all 
non-printable characters from the document or not. If it is set to true, all 
non-printable characters are removed from the document. |
 | `idHashingAlgorithm` | enum(NONE,SHA256,SHA512)| false | NONE|Hashing 
algorithm to use for the document id. This is useful in order to be compliant 
with the ElasticSearch _id hard limit of 512 bytes. |
+| `copyKeyFields` | Boolean | false | false |If the message key schema is AVRO 
or JSON, the message key fields are copied into the ElasticSearch document. |
 
 ### Definition of ElasticSearchSslConfig structure:
 

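The deduplication doc changed above asks producers to set a name and disable the send timeout. As a rough illustration of why the producer name matters, here is a minimal toy model in Python. It is not Pulsar code: the `DedupTopic` class is hypothetical, and the per-producer sequence ID used to recognize a retried message is an assumption that the diff itself does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class DedupTopic:
    """Toy model of a topic with broker-side deduplication enabled (hypothetical)."""
    log: list = field(default_factory=list)
    # Highest sequence ID persisted so far, keyed by producer name (assumed mechanism).
    last_seq: dict = field(default_factory=dict)

    def publish(self, producer_name: str, seq_id: int, payload: str) -> bool:
        """Persist the payload unless this producer already stored seq_id or later."""
        if seq_id <= self.last_seq.get(producer_name, -1):
            return False  # duplicate, for example a client retry: not stored again
        self.log.append(payload)
        self.last_seq[producer_name] = seq_id
        return True


topic = DedupTopic()
topic.publish("my-producer-name", 0, "a")
topic.publish("my-producer-name", 1, "b")
topic.publish("my-producer-name", 1, "b")  # retry of the same message
print(topic.log)  # ['a', 'b']
```

In this model a retry is harmless only because the producer keeps the same name across attempts, which is consistent with the doc's advice to name the producer and let it retry without a timeout.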