tisonkun commented on code in PR #17475:
URL: https://github.com/apache/pulsar/pull/17475#discussion_r963294985
##########
site2/docs/getting-started-standalone.md:
##########
@@ -1,278 +1,150 @@
 ---
 id: getting-started-standalone
-title: Set up a standalone Pulsar locally
+title: Run a standalone Pulsar cluster locally
 sidebar_label: "Run Pulsar locally"
 ---
-For local development and testing, you can run Pulsar in standalone mode on your machine. The standalone mode includes a Pulsar broker, the necessary [RocksDB](http://rocksdb.org/) and BookKeeper components running inside of a single Java Virtual Machine (JVM) process.
-
-> **Pulsar in production?**
-> If you're looking to run a full production Pulsar installation, see the [Deploying a Pulsar instance](deploy-bare-metal.md) guide.
-
-## Install Pulsar standalone
-
-This tutorial guides you through every step of installing Pulsar locally.
-
-### System requirements
-
-Currently, Pulsar is available for 64-bit **macOS**, **Linux**, and **Windows**. To use Pulsar, you need to install 64-bit JRE/JDK.
-For the runtime Java version, see [Pulsar Runtime Java Version Recommendation](https://github.com/apache/pulsar/blob/master/README.md#pulsar-runtime-java-version-recommendation) according to your target Pulsar version.
+For local development and testing, you can run Pulsar in standalone mode on your machine. The standalone mode runs all components inside a single Java Virtual Machine (JVM) process.
 :::tip
-By default, Pulsar allocates 2G JVM heap memory to start. It can be changed in `conf/pulsar_env.sh` file under `PULSAR_MEM`. This is an extra option passed into JVM.
+If you're looking to run a full production Pulsar installation, see the [Deploying a Pulsar instance](deploy-bare-metal.md) guide.
 :::
-:::note
+## Prerequisites
-Broker is only supported on 64-bit JVM.
+Nothing more than a 64-bit JRE is required to run a standalone Pulsar cluster.
+For the required JRE version, see [Pulsar Runtime Java Version Recommendation](https://github.com/apache/pulsar/blob/master/README.md#pulsar-runtime-java-version-recommendation) according to your target Pulsar version.
-:::
+## Download Pulsar distribution
-#### Install JDK on M1
-In the current version, Pulsar uses a BookKeeper version which in turn uses RocksDB. RocksDB is compiled to work on x86 architecture and not ARM. Therefore, Pulsar can only work with x86 JDK. This is planned to be fixed in future versions of Pulsar.
+Download the official Apache Pulsar distribution:
-One of the ways to easily install an x86 JDK is to use [SDKMan](http://sdkman.io). Follow instructions on the SDKMan website.
-
-2. Turn on Rosetta2 compatibility for SDKMan by editing `~/.sdkman/etc/config` and changing the following property from `false` to `true`.
-
-```properties
-sdkman_rosetta2_compatible=true
+```bash
+wget https://archive.apache.org/dist/pulsar/pulsar-@pulsar:version@/apache-pulsar-@pulsar:[email protected]
 ```
-3. Close the current shell / terminal window and open a new one.
-4. Make sure you don't have any previously installed JVM of the same version by listing existing installed versions.
+Once downloaded, unpack the tar file:
-```shell
-sdk list java|grep installed
+```bash
+tar xvfz apache-pulsar-@pulsar:[email protected]
 ```
-Example output:
+For the rest of this quickstart we'll run commands from the root of the distribution folder, so switch to it:
-```text
- | >>> | 17.0.3.6.1 | amzn | installed | 17.0.3.6.1-amzn
+```bash
+cd apache-pulsar-@pulsar:version@
 ```
-If you have any Java 17 version installed, uninstall it.
-
-```shell
-sdk uinstall java 17.0.3.6.1
-```
+## Browse Pulsar distribution
-5. Install any Java versions greater than Java 8.
+List the contents by executing:
-```shell
- sdk install java 17.0.3.6.1-amzn
+```bash
+ls -1F
 ```
-### Install Pulsar using binary release
+You will see it layouts as:
-To get started with Pulsar, download a binary tarball release in one of the following ways:
+```text
+LICENSE
+NOTICE
+README
+bin/
+conf/
+examples/
+instances/
+lib/
+licenses/
+```
-* download from the Apache mirror (<a href="pulsar:binary_release_url" download>Pulsar @pulsar:version@ binary release</a>)
+You may want to note that:
-* download from the Pulsar [downloads page](pulsar:download_page_url)
-
-* download from the Pulsar [releases page](https://github.com/apache/pulsar/releases/latest)
-
-* use [wget](https://www.gnu.org/software/wget):
+* `bin` directory contains the [`pulsar`](reference-cli-tools.md#pulsar) entry point script, and many other command-line tools.
+* `conf` directory contains configuration files, including `broker.conf`.
+* `lib` directory contains JARs used by Pulsar.
+* `examples` directory contains [Pulsar Functions](functions-overview.md) examples.
+* `instances` directory artifacts for [Pulsar Functions](functions-overview.md).
- ```shell
- wget pulsar:binary_release_url
- ```
+## Start the Pulsar standalone cluster
-After you download the tarball, untar it and use the `cd` command to navigate to the resulting directory:
+Run this command to start a standalone Pulsar cluster:
 ```bash
-tar xvfz apache-pulsar-@pulsar:[email protected]
-cd apache-pulsar-@pulsar:version@
+bin/pulsar standalone
 ```
-#### What your package contains
-
-The Pulsar binary package initially contains the following directories:
-
-Directory | Contains
-:---------|:--------
-`bin` | Pulsar's command-line tools, such as [`pulsar`](reference-cli-tools.md#pulsar) and [`pulsar-admin`](/tools/pulsar-admin/).
-`conf` | Configuration files for Pulsar, including [broker configuration](reference-configuration.md#broker) and more.<br />**Note:** Pulsar standalone uses RocksDB as the local metadata store and its configuration file path [`metadataStoreConfigPath`](reference-configuration.md) is configurable in the `standalone.conf` file. For more information about the configurations of RocksDB, see [here](https://github.com/facebook/rocksdb/blob/main/examples/rocksdb_option_file_example.ini) and related [documentation](https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide).
-`examples` | A Java JAR file containing [Pulsar Functions](functions-overview.md) example.
-`instances` | Artifacts created for [Pulsar Functions](functions-overview.md).
-`lib` | The [JAR](https://en.wikipedia.org/wiki/JAR_(file_format)) files used by Pulsar.
-`licenses` | License files, in the`.txt` form, for various components of the Pulsar [codebase](https://github.com/apache/pulsar).
-
-These directories are created once you begin running Pulsar.
-
-Directory | Contains
-:---------|:--------
-`data` | The data storage directory used by RocksDB and BookKeeper.
-`logs` | Logs created by the installation.
-
-:::tip
-
-If you want to use built-in connectors and tiered storage offloaders, you can install them according to the following instructions:
-* [Install built-in connectors (optional)](#install-built-in-connectors-optional)
-* [Install tiered storage offloaders (optional)](#install-tiered-storage-offloaders-optional)
-Otherwise, skip this step and perform the next step [Start Pulsar standalone](#start-pulsar-standalone). Pulsar can be successfully installed without installing built-in connectors and tiered storage offloaders.
-
-:::
-
-### Install built-in connectors (optional)
-
-Since `2.1.0-incubating` release, Pulsar releases a separate binary distribution, containing all the `built-in` connectors.
-To enable those `built-in` connectors, you can download the connectors tarball release in one of the following ways:
-
-* download from the Apache mirror <a href="pulsar:connector_release_url" download>Pulsar IO Connectors @pulsar:version@ release</a>
-
-* download from the Pulsar [downloads page](pulsar:download_page_url)
-
-* download from the Pulsar [releases page](https://github.com/apache/pulsar/releases/latest)
-
-* use [wget](https://www.gnu.org/software/wget):
-
- ```shell
- wget pulsar:connector_release_url/{connector}-@pulsar:[email protected]
- ```
-
-After you download the NAR file, copy the file to the `connectors` directory in the pulsar directory.
-For example, if you download the `pulsar-io-aerospike-@pulsar:[email protected]` connector file, enter the following commands:
+By default, the standalone mode runs a RocksDB instance for metadat storage. If you'd prefer to start a cluster with standalone ZooKeeper server, set `PULSAR_STANDALONE_USE_ZOOKEEPER` to 1:
 ```bash
-mkdir connectors
-mv pulsar-io-aerospike-@pulsar:[email protected] connectors
-
-ls connectors
-pulsar-io-aerospike-@pulsar:[email protected]
-...
+PULSAR_STANDALONE_USE_ZOOKEEPER=1 bin/pulsar standalone
 ```
-:::note
+These directories are created once you started the Pulsar cluster.
-* If you are running Pulsar in a bare metal cluster, make sure `connectors` tarball is unzipped in every pulsar directory of the broker (or in every pulsar directory of function-worker if you are running a separate worker cluster for Pulsar Functions).
-* If you are [running Pulsar in Docker](getting-started-docker.md) or deploying Pulsar using a docker image (e.g. [K8S](deploy-kubernetes.md) or [DC/OS](https://dcos.io/), you can use the `apachepulsar/pulsar-all` image instead of the `apachepulsar/pulsar` image. `apachepulsar/pulsar-all` image has already bundled [all built-in connectors](io-overview.md#working-with-connectors).
+* `data` directory stores all data created by BookKeeper and RocksDB.
+* `logs` directory contains all server-side logs.
-:::
+## Create a topic
-### Install tiered storage offloaders (optional)
+Pulsar stores messages in topics. It's good practice to explicitly create them before using them, even if Pulsar can automagically create them when referenced.
-:::tip
-
-- Since `2.2.0` release, Pulsar releases a separate binary distribution, containing the tiered storage offloaders.
-- To enable the tiered storage feature, follow the instructions below; otherwise skip this section.
-
-:::
-
-To get started with [tiered storage offloaders](concepts-tiered-storage.md), you need to download the offloaders tarball release on every broker node in one of the following ways:
-
-* download from the Apache mirror <a href="pulsar:offloader_release_url" download>Pulsar Tiered Storage Offloaders @pulsar:version@ release</a>
-
-* download from the Pulsar [downloads page](pulsar:download_page_url)
-
-* download from the Pulsar [releases page](https://github.com/apache/pulsar/releases/latest)
-
-* use [wget](https://www.gnu.org/software/wget):
-
- ```shell
- wget pulsar:offloader_release_url
- ```
-
-After you download the tarball, untar the offloaders package and copy the offloaders as `offloaders`
-in the pulsar directory:
+Run this command to create a new topic into which we'll write and read some test messages:
 ```bash
-tar xvfz apache-pulsar-offloaders-@pulsar:[email protected]
-
-// you will find a directory named `apache-pulsar-offloaders-@pulsar:version@` in the pulsar directory
-// then copy the offloaders
-
-mv apache-pulsar-offloaders-@pulsar:version@/offloaders offloaders
-
-ls offloaders
-tiered-storage-jcloud-@pulsar:[email protected]
+bin/pulsar-admin topics create persistent://public/default/quickstart
 ```
-For more information on how to configure tiered storage, see [Tiered storage cookbook](cookbooks-tiered-storage.md).
-
-:::note
-
-* If you are running Pulsar in a bare metal cluster, make sure that `offloaders` tarball is unzipped in every broker's pulsar directory.
-* If you are [running Pulsar in Docker](getting-started-docker.md) or deploying Pulsar using a docker image (e.g. [K8S](deploy-kubernetes.md) or DC/OS), you can use the `apachepulsar/pulsar-all` image instead of the `apachepulsar/pulsar` image. `apachepulsar/pulsar-all` image has already bundled tiered storage offloaders.
-
-:::
+## Write messages to the topic
-## Start Pulsar standalone
+You can use the `pulsar` command line tool to write messages to a topic. This is useful for experimentation, but in practice you'll use the Producer API in your application code, or Pulsar IO connectors for pulling data in from other systems to Pulsar.
-Once you have an up-to-date local copy of the release, you can start a local cluster using the [`pulsar`](reference-cli-tools.md#pulsar) command, which is stored in the `bin` directory, and specifying that you want to start Pulsar in standalone mode.
+Run this command to produce a message:
 ```bash
-bin/pulsar standalone
+bin/pulsar-client produce quickstart --messages 'Hello Pulsar!'
 ```
-If you have started Pulsar successfully, you will see `INFO`-level log messages like this:
+## Read messages from the topic
-```bash
-21:59:29.327 [DLM-/stream/storage-OrderedScheduler-3-0] INFO org.apache.bookkeeper.stream.storage.impl.sc.StorageContainerImpl - Successfully started storage container (0).
-21:59:34.576 [main] INFO org.apache.pulsar.broker.authentication.AuthenticationService - Authentication is disabled
-21:59:34.576 [main] INFO org.apache.pulsar.websocket.WebSocketService - Pulsar WebSocket Service started
-```
+Now that we've written message to the topic, we'll read those messages back.
-:::tip
-
-* The service is running on your terminal, which is under your direct control. If you need to run other commands, open a new terminal window.
-* To run the service as a background process, you can use the `bin/pulsar-daemon start standalone` command. For more information, see [pulsar-daemon](/docs/en/reference-cli-tools/#pulsar-daemon).
-* To perform a health check, you can use the `bin/pulsar-admin brokers healthcheck` command. For more information, see [Pulsar-admin docs](/tools/pulsar-admin/).
-* When you start a local standalone cluster, a `public/default` [namespace](concepts-messaging.md#namespaces) is created automatically. The namespace is used for development purposes. All Pulsar topics are managed within namespaces. For more information, see [Topics](concepts-messaging.md#topics).
-* By default, there is no encryption, authentication, or authorization configured. Apache Pulsar can be accessed from a remote server without any authorization. See [Security Overview](security-overview.md) for how to secure your deployment.
-
-:::
-
-## Use Pulsar standalone
-
-Pulsar provides a CLI tool called [`pulsar-client`](reference-cli-tools.md#pulsar-client). The pulsar-client tool enables you to consume and produce messages to a Pulsar topic in a running cluster.
-
-### Consume a message
-
-The following command consumes a message with the subscription name `first-subscription` to the `my-topic` topic:
+Run this command to launch the consumer:
 ```bash
-bin/pulsar-client consume my-topic -s "first-subscription"
+bin/pulsar-client consume quickstart -s 'first-subscription' -p Earliest -n 0

Review Comment:
   ```
   bin/pulsar-client consume topic1 -s 'subscription1'
   ```

   consumes only one message, starting from the latest position. It's hard to write a simple demo. I'd say `bin/pulsar-client consume topic1 -s 'subscription1' -p Earliest -n 0` is a relatively simple example :)
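For reference, the two invocations from the review comment can be put side by side. This is only a sketch: the `run` helper, the `DRY_RUN` variable, and the two function names are hypothetical and exist purely for illustration; `topic1` and `subscription1` are the placeholder names used in the comment. It assumes the documented `pulsar-client consume` options, where `-p` selects the subscription position and `-n 0` keeps consuming until interrupted. By default the script only prints the commands, so it can be read without a running cluster.

```bash
#!/usr/bin/env bash
# Sketch only: `run`, DRY_RUN, and the function names below are hypothetical
# helpers for illustration; they are not part of Pulsar.

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Default behaviour: the subscription starts at the latest position and the
# client exits after one message, so messages produced earlier are not shown.
consume_latest_once() {
  run bin/pulsar-client consume "$1" -s "$2"
}

# Variant suggested in the review: start from the earliest message and keep
# consuming until interrupted (-n 0), so earlier messages are shown.
consume_all_from_earliest() {
  run bin/pulsar-client consume "$1" -s "$2" -p Earliest -n 0
}

consume_latest_once topic1 subscription1
consume_all_from_earliest topic1 subscription1
```

To run the commands for real, execute the script with `DRY_RUN=0` from the root of a Pulsar distribution while a standalone cluster is up.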
