merlimat closed pull request #1468: Documentation for non-persistent topics
URL: https://github.com/apache/incubator-pulsar/pull/1468
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/site/_data/config/broker.yaml b/site/_data/config/broker.yaml
index 9cc9fb6d9b..3b0118083a 100644
--- a/site/_data/config/broker.yaml
+++ b/site/_data/config/broker.yaml
@@ -18,6 +18,12 @@
 #
 configs:
+- name: enablePersistentTopics
+ default: 'true'
+ description: Whether persistent topics are enabled on the broker
+- name: enableNonPersistentTopics
+ default: 'true'
+ description: Whether non-persistent topics are enabled on the broker
 - name: functionsWorkerEnabled
 description: Whether the Pulsar Functions worker service is enabled in the broker
 default: 'false'
diff --git a/site/_data/sidebar.yaml b/site/_data/sidebar.yaml
index a4eeb0ba5c..68fe4263c4 100644
--- a/site/_data/sidebar.yaml
+++ b/site/_data/sidebar.yaml
@@ -134,6 +134,8 @@ groups:
 docs:
 - title: Message deduplication
 endpoint: message-deduplication
+ - title: Non-persistent messaging
+ endpoint: non-persistent-messaging
 - title: Partitioned topics
 endpoint: PartitionedTopics
 - title: Retention and expiry
diff --git a/site/_includes/explanations/non-persistent-topics.md b/site/_includes/explanations/non-persistent-topics.md
index 74ff6631cf..a94358e585 100644
--- a/site/_includes/explanations/non-persistent-topics.md
+++ b/site/_includes/explanations/non-persistent-topics.md
@@ -19,64 +19,10 @@
 -->
-{% include admonition.html type="success" title='Notice' content="
-This feature is still in experimental mode and implementation details may change in future release.
-" %}
+By default, Pulsar persistently stores *all* {% popover unacknowledged %} messages on multiple [BookKeeper](#persistent-storage) {% popover bookies %} (storage nodes).
Data for messages on persistent topics can thus survive {% popover broker %} restarts and subscriber failover.
-As name suggests, non-persist topic does not persist messages into any durable storage disk unlike persistent topic where messages are durably persisted on multiple disks.
+Pulsar also, however, supports **non-persistent topics**, which are topics on which messages are *never* persisted to disk and live only in memory. When using non-persistent delivery, killing a Pulsar {% popover broker %} or disconnecting a subscriber to a topic means that all in-transit messages are lost on that (non-persistent) topic, meaning that clients may see message loss.
-Therefore, if you are using persistent delivery, messages are persisted to disk/database so that they will survive a broker restart or subscriber failover. While using non-persistent delivery, if you kill a broker or subscriber is disconnected then subscriber will lose all in-transit messages. So, client may see message loss with non-persistent topic.
+Non-persistent topics have names of this form (note the `non-persistent` in the name):
-- In non-persistent topic, as soon as broker receives published message, it immediately delivers this message to all connected subscribers without persisting them into any storage. So, if subscriber gets disconnected with broker then broker will not be able to deliver those in-transit messages and subscribers will never be able to receive those messages again. Broker also drops a message for the consumer, if consumer does not have enough permit to consume message, or consumer TCP channel is not writable. Therefore, consumer receiver queue size (to accommodate enough permits) and TCP-receiver window size (to keep channel writable) should be configured properly to avoid message drop for that consumer.
-- Broker only allows configured number of in-flight messages per client connection.
So, if producer tries to publish messages higher than this rate, then broker silently drops those new incoming messages without processing and delivering them to the subscribers. However, broker acknowledges with special message-id (`msg-id: -1:-1`) for those dropped messages to signal producer about the message drop.
-
-#### Performance
-
-Non-persistent messaging is usually faster than persistent messaging because broker does not persist messages and immediately sends ack back to producer as soon as that message deliver to all connected subscribers. Therefore, producer sees comparatively low publish latency with non-persistent topic.
-
-
-#### Client API
-
-
-A topic name will look like:
-
-```
-non-persistent://my-property/us-west/my-namespace/my-topic
-```
-
-Producer and consumer can connect to non-persistent topic in a similar way, as persistent topic except topic name must start with `non-persistent`.
-
-Non-persistent topic supports all 3 different subscription-modes: **Exclusive**, **Shared**, **Failover** which are already explained in details at [GettingStarted](../../getting-started/ConceptsAndArchitecture).
-
-
-##### Consumer API
-
-```java
-PulsarClient client = PulsarClient.create("pulsar://localhost:6650");
-
-Consumer consumer = client.subscribe(
- "non-persistent://sample/standalone/ns1/my-topic",
- "my-subscribtion-name");
-```
-
-##### Producer API
-
-```java
-PulsarClient client = PulsarClient.create("pulsar://localhost:6650");
-
-Producer producer = client.createProducer(
- "non-persistent://sample/standalone/ns1/my-topic");
-```
-
-#### Broker configuration
-
-Sometimes, there would be a need to configure few dedicated brokers in a cluster, to just serve non-persistent topics.
-
-Broker configuration for enabling broker to own only configured type of topics
-
-```
-# It disables broker to load persistent topics
-enablePersistentTopics=false
-# It enables broker to load non-persistent topics
-enableNonPersistentTopics=true
-```
+{% include topic.html type="non-persistent" p="property" c="cluster" n="namespace" t="topic" %}
\ No newline at end of file
diff --git a/site/_includes/topic.html b/site/_includes/topic.html
index 5a5789d354..00d89ba06d 100644
--- a/site/_includes/topic.html
+++ b/site/_includes/topic.html
@@ -19,5 +19,5 @@
 -->
 <section class="topic">
- persistent://<span class="property">{{ include.p }}</span>/<span class="cluster">{{ include.c }}</span>/<span class="namespace">{{ include.n }}</span>/<span class="t">{{ include.t }}</span>
+ {% if include.type %}{{ include.type }}{% else %}persistent{% endif %}://<span class="property">{{ include.p }}</span>/<span class="cluster">{{ include.c }}</span>/<span class="namespace">{{ include.n }}</span>/<span class="t">{{ include.t }}</span>
 </section>
diff --git a/site/docs/latest/cookbooks/non-persistent-messaging.md b/site/docs/latest/cookbooks/non-persistent-messaging.md
new file mode 100644
index 0000000000..9791121a80
--- /dev/null
+++ b/site/docs/latest/cookbooks/non-persistent-messaging.md
@@ -0,0 +1,66 @@
+---
+title: Non-persistent messaging
+---
+
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.
You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+-->
+
+**Non-persistent topics** are Pulsar {% popover topics %} in which message data is *never* [persistently stored](../../getting-started/ConceptsAndArchitecture#persistent-storage) and kept only in memory. This cookbook provides:
+
+* A basic [conceptual overview](#overview) of non-persistent topics
+* Information about [configurable parameters](#configuration) related to non-persistent topics
+* A guide to the [CLI interface](#cli) for managing non-persistent topics
+
+## Overview of non-persistent topics {#overview}
+
+{% include explanations/non-persistent-topics.md %}
+
+{% include admonition.html type="info" content='For more high-level information about non-persistent topics, see the [Concepts and Architecture](../../getting-started/ConceptsAndArchitecture#non-persistent-topics) documentation.' %}
+
+## Using non-persistent topics {#using}
+
+{% include admonition.html type="warning" content='In order to use non-persistent topics, they must be [enabled](#enabling) in your Pulsar broker configuration.' %}
+
+In order to use non-persistent topics, you only need to differentiate them by name when interacting with them.
This [`pulsar-client produce`](../../CliTools#pulsar-client-produce) command, for example, would produce one message on a non-persistent topic in a {% popover standalone %} cluster:
+
+```bash
+$ bin/pulsar-client produce non-persistent://sample/standalone/ns1/example-np-topic \
+ --num-produce 1 \
+ --messages "This message will be stored only in memory"
+```
+
+{% include admonition.html type="success" content="For a more thorough guide to non-persistent topics from an administrative perspective, see the [Non-persistent topics](../../admin-api/non-persistent-topics) guide." %}
+
+## Enabling non-persistent topics {#enabling}
+
+In order to enable non-persistent topics in a Pulsar {% popover broker %}, the [`enableNonPersistentTopics`](../../reference/Configuration#broker-enableNonPersistentTopics) must be set to `true`. This is the default, and so you won't need to take any action to enable non-persistent messaging.
+
+{% include admonition.html type="info" title="Configuration for standalone mode" content="If you're running Pulsar in standalone mode, the same configurable parameters are available but in the [`standalone.conf`](../../reference/Configuration#standalone) configuration file." %}
+
+If you'd like to enable *only* non-persistent topics in a broker, you can set the [`enablePersistentTopics`](../../reference/Configuration#broker-enablePersistentTopics) parameter to `false` and the `enableNonPersistentTopics` parameter to `true`.
+
+## Managing non-persistent topics via the CLI {#cli}
+
+Non-persistent topics can be managed using the [`pulsar-admin non-persistent`](../../reference/CliTools#pulsar-admin-non-persistent) command-line interface.
With that interface you can perform actions like [create a partitioned non-persistent topic](../../reference/CliTools#pulsar-admin-non-persistent-create-partitioned-topic), get [stats](../../reference/CliTools#pulsar-admin-non-persistent-stats) for a non-persistent topic, [list](../../) non-persistent topics under a namespace, and more.
+
+## Non-persistent topics and Pulsar clients {#clients}
+
+You shouldn't need to make any changes to your Pulsar clients to use non-persistent messaging beyond making sure that you use proper [topic names](#using) with `non-persistent` as the topic type.
\ No newline at end of file
diff --git a/site/docs/latest/getting-started/ConceptsAndArchitecture.md b/site/docs/latest/getting-started/ConceptsAndArchitecture.md
index 66c7558741..cd92a95b32 100644
--- a/site/docs/latest/getting-started/ConceptsAndArchitecture.md
+++ b/site/docs/latest/getting-started/ConceptsAndArchitecture.md
@@ -108,7 +108,7 @@ As in other pub-sub systems, topics in Pulsar are named channels for transmittin
 | `topic` | The final part of the name. Topic names are freeform and have no special meaning in a Pulsar instance. |
 {% include admonition.html type="success" title="No need to explicitly create new topics"
-content="Application does not explicitly create the topic but attempting to write or receive message on a topic that does not yet exist, Pulsar will automatically create that topic under the [namespace](#namespace)." %}
+content="You don't need to explicitly create topics in Pulsar. If a client attempts to write or receive messages to/from a topic that does not yet exist, Pulsar will automatically create that topic under the [namespace](#namespace) provided in the [topic name](#topics)."
%}
 ### Namespace
@@ -192,6 +192,52 @@ For code examples, see:
 {% include explanations/non-persistent-topics.md %}
+{% include admonition.html type="success" content='For more info on using non-persistent topics, see the [Non-persistent messaging cookbook](../../cookbooks/non-persistent-topics).' %}
+
+In non-persistent topics, {% popover brokers %} immediately deliver messages to all connected subscribers *without persisting them* in [BookKeeper](#persistent-storage). If a subscriber is disconnected, the broker will not be able to deliver those in-transit messages, and subscribers will never be able to receive those messages again. Eliminating the persistent storage step makes messaging on non-persistent topics slightly faster than on persistent topics in some cases, but with the caveat that some of the core benefits of Pulsar are lost.
+
+{% include admonition.html type="danger" content="With non-persistent topics, message data lives only in memory. If a message broker fails or message data can otherwise not be retrieved from memory, your message data may be lost. Use non-persistent topics only if you're *certain* that your use case requires it and can sustain it." %}
+
+By default, non-persistent topics are enabled on Pulsar {% popover brokers %}. You can disable them in the broker's [configuration](../../reference/Configuration#broker-enableNonPersistentTopics). You can manage non-persistent topics using the [`pulsar-admin non-persistent`](../../reference/CliTools#pulsar-admin-non-persistent) interface.
+
+#### Performance
+
+Non-persistent messaging is usually faster than persistent messaging because brokers don't persist messages and immediately send acks back to the producer as soon as that message is deliver to all connected subscribers. Producers thus see comparatively low publish latency with non-persistent topic.
+
+#### Client API
+
+Producers and consumers can connect to non-persistent topics in the same way as persistent topics, with the crucial difference that the topic name must start with `non-persistent`. All three subscription modes---[exclusive](#exclusive), [shared](#shared), and [failover](#failover)---are supported for non-persistent topics.
+
+Here's an example [Java consumer](../../clients/Java#consumer) for a non-persistent topic:
+
+```java
+PulsarClient client = PulsarClient.create("pulsar://localhost:6650");
+String npTopic = "non-persistent://sample/standalone/ns1/my-topic";
+String subscriptionName = "my-subscription-name";
+
+Consumer consumer = client.subscribe(npTopic, subscriptionName);
+```
+
+Here's an example [Java producer](../../clients/Java#producer) for the same non-persistent topic:
+
+```java
+Producer producer = client.createProducer(npTopic);
+```
+
+#### Broker configuration
+
+Sometimes, there would be a need to configure few dedicated brokers in a cluster, to just serve non-persistent topics.
+
+Broker configuration for enabling broker to own only configured type of topics
+
+```
+# It disables broker to load persistent topics
+enablePersistentTopics=false
+# It enables broker to load non-persistent topics
+enableNonPersistentTopics=true
+```
+
+
 ## Architecture overview
 At the highest level, a Pulsar {% popover instance %} is composed of one or more Pulsar {% popover clusters %}. Clusters within an instance can [replicate](#replicate) data amongst themselves.
@@ -252,12 +298,12 @@ When creating a [new cluster](../../admin/ClustersBrokers#initialize-cluster-met
 ## Persistent storage
-![Brokers and bookies](/img/broker-bookie.png)
-
 Pulsar provides guaranteed message delivery for applications. If a message successfully reaches a Pulsar {% popover broker %}, it will be delivered to its intended target.
 This guarantee requires that non-{% popover acknowledged %} messages are stored in a durable manner until they can be delivered to and acknowledged by {% popover consumers %}. This mode of messaging is commonly called *persistent messaging*. In Pulsar, N copies of all messages are stored and synced on disk, for example 4 copies across two servers with mirrored [RAID](https://en.wikipedia.org/wiki/RAID) volumes on each server.
+### Apache BookKeeper {#bookkeeper}
+
 Pulsar uses a system called [Apache BookKeeper](http://bookkeeper.apache.org/) for persistent message storage. BookKeeper is a distributed [write-ahead log](https://en.wikipedia.org/wiki/Write-ahead_logging) (WAL) system that provides a number of crucial advantages for Pulsar:
 * It enables Pulsar to utilize many independent logs, called [ledgers](#ledgers). Multiple ledgers can be created for {% popover topics %} over time.
@@ -273,7 +319,11 @@ At the moment, Pulsar only supports persistent message storage. This accounts fo
 {% include topic.html p="my-property" c="global" n="my-namespace" t="my-topic" %}
-In the future, Pulsar will support ephemeral message storage.
+{% include admonition.html type="success" content='Pulsar also supports ephemeral ([non-persistent](#non-persistent-topics)) message storage.' %}
+
+You can see an illustration of how {% popover brokers %} and {% popover bookies %} interact in the diagram below:
+
+![Brokers and bookies](/img/broker-bookie.png)
 ### Ledgers

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services