tuteng commented on a change in pull request #5197: [Doc] Update *Debezium
Connector Guide*
URL: https://github.com/apache/pulsar/pull/5197#discussion_r325004765
##########
File path: site2/docs/io-cdc-debezium.md
##########
@@ -1,227 +1,281 @@
---
id: io-cdc-debezium
-title: CDC Debezium Connector
-sidebar_label: CDC Debezium Connector
+title: Debezium Connector
+sidebar_label: Debezium Connector
---
-### Source Configuration Options
+The Debezium source connector pulls messages from MySQL or PostgreSQL and persists them to Pulsar topics.
-The Configuration is mostly related to Debezium task config, besides this we
should provides the service URL of Pulsar cluster, and topic names that used to
store offset and history.
+This guide explains how to configure and use the Debezium source connector.
+
+## Configuration
+
+The configuration of the Debezium source connector has the following parameters.
| Name | Required | Default | Description |
|------|----------|---------|-------------|
-| `task.class` | `true` | `null` | A source task class that implemented in
Debezium. |
-| `database.hostname` | `true` | `null` | The address of the Database server. |
-| `database.port` | `true` | `null` | The port number of the Database server..
|
-| `database.user` | `true` | `null` | The name of the Database user that has
the required privileges. |
-| `database.password` | `true` | `null` | The password for the Database user
that has the required privileges. |
-| `database.server.id` | `true` | `null` | The connector’s identifier that
must be unique within the Database cluster and similar to Database’s server-id
configuration property. |
-| `database.server.name` | `true` | `null` | The logical name of the Database
server/cluster, which forms a namespace and is used in all the names of the
Kafka topics to which the connector writes, the Kafka Connect schema names, and
the namespaces of the corresponding Avro schema when the Avro Connector is
used. |
-| `database.whitelist` | `false` | `null` | A list of all databases hosted by
this server that this connector will monitor. This is optional, and there are
other properties for listing the databases and tables to include or exclude
from monitoring. |
-| `key.converter` | `true` | `null` | The converter provided by Kafka Connect
to convert record key. |
-| `value.converter` | `true` | `null` | The converter provided by Kafka
Connect to convert record value. |
-| `database.history` | `true` | `null` | The name of the database history
class name. |
-| `database.history.pulsar.topic` | `true` | `null` | The name of the database
history topic where the connector will write and recover DDL statements. This
topic is for internal use only and should not be used by consumers. |
-| `database.history.pulsar.service.url` | `true` | `null` | Pulsar cluster
service url for history topic. |
-| `pulsar.service.url` | `true` | `null` | Pulsar cluster service url. |
-| `offset.storage.topic` | `true` | `null` | Record the last committed offsets
that the connector successfully completed. |
+| `task.class` | true | null | A source task class that is implemented in Debezium. |
+| `database.hostname` | true | null | The address of a database server. |
+| `database.port` | true | null | The port number of a database server. |
+| `database.user` | true | null | The name of a database user that has the required privileges. |
+| `database.password` | true | null | The password for a database user that has the required privileges. |
+| `database.server.id` | true | null | The connector’s identifier, which must be unique within a database cluster and is similar to the database’s server-id configuration property. |
+| `database.server.name` | true | null | The logical name of a database server/cluster, which forms a namespace and is used in all the names of Kafka topics to which the connector writes, the Kafka Connect schema names, and the namespaces of the corresponding Avro schema when the Avro Connector is used. |
+| `database.whitelist` | false | null | A list of all databases hosted by this server that the connector monitors.<br/><br/> This is optional, and there are other properties for listing databases and tables to include or exclude from monitoring. |
+| `key.converter` | true | null | The converter provided by Kafka Connect to convert the record key. |
+| `value.converter` | true | null | The converter provided by Kafka Connect to convert the record value. |
+| `database.history` | true | null | The name of the database history class. |
+| `database.history.pulsar.topic` | true | null | The name of the database history topic where the connector writes and recovers DDL statements. <br/><br/>**Note: this topic is for internal use only and should not be used by consumers.** |
+| `database.history.pulsar.service.url` | true | null | Pulsar cluster service URL for the history topic. |
+| `pulsar.service.url` | true | null | Pulsar cluster service URL. |
+| `offset.storage.topic` | true | null | The topic that records the last committed offsets that the connector successfully completes. |
## Example of MySQL
-We need to create a configuration file before using the Pulsar Debezium
connector.
-
-### Configuration
+You need to create a configuration file before using the Pulsar Debezium connector.
+
+### Configuration
+
+You can use one of the following methods to create a configuration file.
+
+* JSON
+
+ ```json
+ {
+ "database.hostname": "localhost",
+ "database.port": "3306",
+ "database.user": "debezium",
+ "database.password": "dbz",
+ "database.server.id": "184054",
+ "database.server.name": "dbserver1",
+ "database.whitelist": "inventory",
+ "database.history": "org.apache.pulsar.io.debezium.PulsarDatabaseHistory",
+ "database.history.pulsar.topic": "history-topic",
+ "database.history.pulsar.service.url": "pulsar://127.0.0.1:6650",
+ "key.converter": "org.apache.kafka.connect.json.JsonConverter",
+ "value.converter": "org.apache.kafka.connect.json.JsonConverter",
+ "pulsar.service.url": "pulsar://127.0.0.1:6650",
+ "offset.storage.topic": "offset-topic"
+ }
+ ```
-Here is a JSON configuration example:
-
-```json
-{
- "database.hostname": "localhost",
- "database.port": "3306",
- "database.user": "debezium",
- "database.password": "dbz",
- "database.server.id": "184054",
- "database.server.name": "dbserver1",
- "database.whitelist": "inventory",
- "database.history": "org.apache.pulsar.io.debezium.PulsarDatabaseHistory",
- "database.history.pulsar.topic": "history-topic",
- "database.history.pulsar.service.url": "pulsar://127.0.0.1:6650",
- "key.converter": "org.apache.kafka.connect.json.JsonConverter",
- "value.converter": "org.apache.kafka.connect.json.JsonConverter",
- "pulsar.service.url": "pulsar://127.0.0.1:6650",
- "offset.storage.topic": "offset-topic"
-}
-```
-
-Optionally, you can create a `debezium-mysql-source-config.yaml` file, and
copy the [contents]
(https://github.com/apache/pulsar/blob/master/pulsar-io/debezium/mysql/src/main/resources/debezium-mysql-source-config.yaml)
below to the `debezium-mysql-source-config.yaml` file.
-
-```$yaml
-tenant: "public"
-namespace: "default"
-name: "debezium-mysql-source"
-topicName: "debezium-mysql-topic"
-archive: "connectors/pulsar-io-debezium-mysql-{{pulsar:version}}.nar"
-
-parallelism: 1
-
-configs:
- ## config for mysql, docker image: debezium/example-mysql:0.8
- database.hostname: "localhost"
- database.port: "3306"
- database.user: "debezium"
- database.password: "dbz"
- database.server.id: "184054"
- database.server.name: "dbserver1"
- database.whitelist: "inventory"
-
- database.history: "org.apache.pulsar.io.debezium.PulsarDatabaseHistory"
- database.history.pulsar.topic: "history-topic"
- database.history.pulsar.service.url: "pulsar://127.0.0.1:6650"
- ## KEY_CONVERTER_CLASS_CONFIG, VALUE_CONVERTER_CLASS_CONFIG
- key.converter: "org.apache.kafka.connect.json.JsonConverter"
- value.converter: "org.apache.kafka.connect.json.JsonConverter"
- ## PULSAR_SERVICE_URL_CONFIG
- pulsar.service.url: "pulsar://127.0.0.1:6650"
- ## OFFSET_STORAGE_TOPIC_CONFIG
- offset.storage.topic: "offset-topic"
-```
+* YAML
+
+ You can create a `debezium-mysql-source-config.yaml` file and copy the [contents](https://github.com/apache/pulsar/blob/master/pulsar-io/debezium/mysql/src/main/resources/debezium-mysql-source-config.yaml) below to the `debezium-mysql-source-config.yaml` file.
+
+ ```yaml
+ tenant: "public"
+ namespace: "default"
+ name: "debezium-mysql-source"
+ topicName: "debezium-mysql-topic"
+ archive: "connectors/pulsar-io-debezium-mysql-{{pulsar:version}}.nar"
+ parallelism: 1
+
+ configs:
+
+ ## config for mysql, docker image: debezium/example-mysql:0.8
+ database.hostname: "localhost"
+ database.port: "3306"
+ database.user: "debezium"
+ database.password: "dbz"
+ database.server.id: "184054"
+ database.server.name: "dbserver1"
+ database.whitelist: "inventory"
+ database.history: "org.apache.pulsar.io.debezium.PulsarDatabaseHistory"
+ database.history.pulsar.topic: "history-topic"
+ database.history.pulsar.service.url: "pulsar://127.0.0.1:6650"
+
+ ## KEY_CONVERTER_CLASS_CONFIG, VALUE_CONVERTER_CLASS_CONFIG
+ key.converter: "org.apache.kafka.connect.json.JsonConverter"
+ value.converter: "org.apache.kafka.connect.json.JsonConverter"
+
+ ## PULSAR_SERVICE_URL_CONFIG
+ pulsar.service.url: "pulsar://127.0.0.1:6650"
+
+ ## OFFSET_STORAGE_TOPIC_CONFIG
+ offset.storage.topic: "offset-topic"
Review comment:
There seems to be a loss of indentation here.
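For reference, a rough sketch of how the block might look with the nesting restored. It assumes the connector settings are meant to sit under `configs:`, as in the version of the file being removed. The four-space indent is only an assumed choice (any consistent indent is valid YAML), and inside the `* YAML` list item the whole fence would presumably need the list item's own indent on top of this.

```yaml
tenant: "public"
namespace: "default"
name: "debezium-mysql-source"
topicName: "debezium-mysql-topic"
archive: "connectors/pulsar-io-debezium-mysql-{{pulsar:version}}.nar"
parallelism: 1

configs:
    # Keys below are children of `configs`; the indent width is an assumption.
    ## config for mysql, docker image: debezium/example-mysql:0.8
    database.hostname: "localhost"
    database.port: "3306"
    database.user: "debezium"
    database.password: "dbz"
    database.server.id: "184054"
    database.server.name: "dbserver1"
    database.whitelist: "inventory"
    database.history: "org.apache.pulsar.io.debezium.PulsarDatabaseHistory"
    database.history.pulsar.topic: "history-topic"
    database.history.pulsar.service.url: "pulsar://127.0.0.1:6650"

    ## KEY_CONVERTER_CLASS_CONFIG, VALUE_CONVERTER_CLASS_CONFIG
    key.converter: "org.apache.kafka.connect.json.JsonConverter"
    value.converter: "org.apache.kafka.connect.json.JsonConverter"

    ## PULSAR_SERVICE_URL_CONFIG
    pulsar.service.url: "pulsar://127.0.0.1:6650"

    ## OFFSET_STORAGE_TOPIC_CONFIG
    offset.storage.topic: "offset-topic"
```

Without the indent, a YAML parser would read `database.hostname` and the keys after it as top-level entries alongside `tenant` and `name`, and `configs` would end up empty.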