abhishekagarwal87 commented on code in PR #12842:
URL: https://github.com/apache/druid/pull/12842#discussion_r934523130
##########
docs/development/extensions-core/kafka-extraction-namespace.md:
##########
@@ -32,28 +32,56 @@ If you need updates to populate as promptly as possible, it is possible to plug
{
"type":"kafka",
"kafkaTopic":"testTopic",
- "kafkaProperties":{"zookeeper.connect":"somehost:2181/kafka"}
+ "kafkaProperties":{
+ "bootstrap.servers":"kafka.service:9092"
+ }
}
```
-|Parameter|Description|Required|Default|
-|---------|-----------|--------|-------|
-|`kafkaTopic`|The Kafka topic to read the data from|Yes||
-|`kafkaProperties`|Kafka consumer properties. At least"zookeeper.connect" must be specified. Only the zookeeper connector is supported|Yes||
-|`connectTimeout`|How long to wait for an initial connection|No|`0` (do not wait)|
-|`isOneToOne`|The map is a one-to-one (see [Lookup DimensionSpecs](../../querying/dimensionspecs.md))|No|`false`|
+| Parameter | Description | Required | Default |
+|-------------------|-----------------------------------------------------------------------------------------|----------|-------------------|
+| `kafkaTopic` | The Kafka topic to read the data from | Yes ||
+| `kafkaProperties` | Kafka consumer properties (`bootstrap.servers` must be specified) | Yes ||
+| `connectTimeout` | How long to wait for an initial connection | No | `0` (do not wait) |
+| `isOneToOne` | The map is a one-to-one (see [Lookup DimensionSpecs](../../querying/dimensionspecs.md)) | No | `false` |
-The extension `kafka-extraction-namespace` enables reading from a Kafka feed which has name/key pairs to allow renaming of dimension values. An example use case would be to rename an ID to a human readable format.
+The extension `kafka-extraction-namespace` enables reading from an [Apache Kafka](https://kafka.apache.org/) topic which has name/key pairs to allow renaming of dimension values. An example use case would be to rename an ID to a human-readable format.
-The consumer properties `group.id` and `auto.offset.reset` CANNOT be set in `kafkaProperties` as they are set by the extension as `UUID.randomUUID().toString()` and `smallest` respectively.
+## How it Works
-See [lookups](../../querying/lookups.md) for how to configure and use lookups.
+The extractor works by consuming the configured Kafka topic from the beginning, and appending every record to an internal map. The key of the Kafka record is used as the key of the map, and the payload of the record is used as the value. At query time, a lookup can be used to transform the key into the associated value. See [lookups](../../querying/lookups.md) for how to configure and use lookups in a query. Keys and values are both stored as strings by the lookup extractor.
+
+The extractor remains subscribed to the topic, so new records are added to the lookup map as they appear. This allows for lookup values to be updated in near-realtime. If two records are added to the topic with the same key, the record with the larger offset will replace the previous record in the lookup map. A record with a `null` payload will be treated as a tombstone record, and the associated key will be removed from the lookup map.
Review Comment:
The PR that adds the code to remove a key when its record has a `null`
payload has not been merged yet.
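For reference, here is a minimal sketch of the map semantics the added paragraph describes. It is written in Python rather than the extension's Java, and the names `Record` and `apply_records` are hypothetical, so it should be read as an illustration and not as the extension's code. The tombstone branch is the part that depends on the unmerged PR.

```python
from typing import Dict, Iterable, Optional, Tuple

# One record as (key, payload). A payload of None stands in for a Kafka
# record whose value is null. Records are assumed to arrive in offset order.
Record = Tuple[str, Optional[str]]


def apply_records(records: Iterable[Record],
                  lookup: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Fold records into a lookup map; the record with the larger offset wins."""
    lookup = {} if lookup is None else lookup
    for key, payload in records:
        if payload is None:
            # Tombstone: drop the key. This is the documented behavior under
            # review; it is an assumption here, not confirmed as merged.
            lookup.pop(key, None)
        else:
            # A later record with the same key replaces the earlier value.
            lookup[key] = payload
    return lookup


# Hypothetical topic contents, oldest offset first.
topic = [("1", "apple"), ("2", "banana"), ("1", "avocado"), ("2", None)]
print(apply_records(topic))  # {'1': 'avocado'}
```

Without the tombstone handling, the last record would have no way to remove key `2`, which is why the documentation sentence should only land together with that change.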
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]