This is an automated email from the ASF dual-hosted git repository.
sijie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pulsar.git
The following commit(s) were added to refs/heads/master by this push:
new 8655d53  [documentation] Improve documentation on bare metal deployment (#2335)
8655d53 is described below
commit 8655d53b62c6e8f25cc629f7922f52e23b07c768
Author: Sijie Guo <[email protected]>
AuthorDate: Fri Aug 10 19:40:23 2018 -0700
[documentation] Improve documentation on bare metal deployment (#2335)
### Motivation
Fixes #2329
### Changes
- How to enable state storage for stateful functions
- How to enable function worker
- How to install builtin connectors
- Instructions to test functions
---
deployment/terraform-ansible/aws/setup-disk.yaml | 36 ++++++
deployment/terraform-ansible/deploy-pulsar.yaml | 20 +---
.../terraform-ansible/templates/bookkeeper.conf | 2 +-
site2/docs/deploy-aws.md | 38 +++++-
site2/docs/deploy-bare-metal.md | 127 ++++++++++++++++++++-
5 files changed, 199 insertions(+), 24 deletions(-)
diff --git a/deployment/terraform-ansible/aws/setup-disk.yaml b/deployment/terraform-ansible/aws/setup-disk.yaml
new file mode 100644
index 0000000..e1360c0
--- /dev/null
+++ b/deployment/terraform-ansible/aws/setup-disk.yaml
@@ -0,0 +1,36 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+- name: Disk setup
+  hosts: pulsar
+  connection: ssh
+  become: true
+  tasks:
+    - command: >
+        tuned-adm profile latency-performance
+    - name: Create and mount disks
+      mount:
+        path: "{{ item.path }}"
+        src: "{{ item.src }}"
+        fstype: xfs
+        opts: defaults,noatime,nodiscard
+        state: present
+      with_items:
+        - { path: "/mnt/journal", src: "/dev/nvme0n1" }
+        - { path: "/mnt/storage", src: "/dev/nvme1n1" }
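The play above registers the mounts; on instances whose NVMe devices are still raw, a filesystem-creation task would also be needed before mounting. A minimal sketch, not part of this playbook, with device names that are assumptions and must match your instance type:

```yaml
# Hypothetical companion task: create XFS filesystems on the raw devices
# before they are mounted. The device names below are assumptions.
- name: Format journal and storage disks
  filesystem:
    fstype: xfs
    dev: "{{ item }}"
  with_items:
    - "/dev/nvme0n1"
    - "/dev/nvme1n1"
```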
diff --git a/deployment/terraform-ansible/deploy-pulsar.yaml b/deployment/terraform-ansible/deploy-pulsar.yaml
index 78dc3a1..d586a5e 100644
--- a/deployment/terraform-ansible/deploy-pulsar.yaml
+++ b/deployment/terraform-ansible/deploy-pulsar.yaml
@@ -17,24 +17,6 @@
# under the License.
#
-- name: Disk setup
-  hosts: pulsar
-  connection: ssh
-  become: true
-  tasks:
-    - command: >
-        tuned-adm profile latency-performance
-    - name: Create and mount disks
-      mount:
-        path: "{{ item.path }}"
-        src: "{{ item.src }}"
-        fstype: xfs
-        opts: defaults,noatime,nodiscard
-        state: present
-      with_items:
-        - { path: "/mnt/journal", src: "/dev/nvme0n1" }
-        - { path: "/mnt/storage", src: "/dev/nvme1n1" }
-
- name: Pulsar setup
hosts: all
connection: ssh
@@ -56,7 +38,7 @@
zookeeper_servers: "{{ groups['zookeeper']|map('extract', hostvars, ['ansible_default_ipv4', 'address'])|map('regex_replace', '(.*)', '\\1:2181') | join(',') }}"
service_url: "pulsar://{{ hostvars[groups['pulsar'][0]].public_ip }}:6650/"
http_url: "http://{{ hostvars[groups['pulsar'][0]].public_ip }}:8080/"
- pulsar_version: "2.0.0-rc1-incubating"
+ pulsar_version: "2.1.0-incubating"
- name: Download Pulsar binary package
unarchive:
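For context, the bumped `pulsar_version` variable above is consumed by the download task that this hunk truncates. A rough sketch of how such an `unarchive` task typically interpolates it is below; the mirror URL and destination path are assumptions, so consult the playbook itself for the exact values:

```yaml
# Sketch only: the URL layout and destination are assumptions, not the committed task.
- name: Download Pulsar binary package
  unarchive:
    src: "https://archive.apache.org/dist/incubator/pulsar/pulsar-{{ pulsar_version }}/apache-pulsar-{{ pulsar_version }}-bin.tar.gz"
    remote_src: yes
    dest: /opt/pulsar
    extra_opts: ["--strip-components=1"]
```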
diff --git a/deployment/terraform-ansible/templates/bookkeeper.conf b/deployment/terraform-ansible/templates/bookkeeper.conf
index 2f5dd4a..9e7fcc9 100644
--- a/deployment/terraform-ansible/templates/bookkeeper.conf
+++ b/deployment/terraform-ansible/templates/bookkeeper.conf
@@ -53,7 +53,7 @@ minUsableSizeForIndexFileCreation=1073741824
# Configure a specific hostname or IP address that the bookie should use to advertise itself to
# clients. If not set, bookie will advertise its own IP address or hostname, depending on the
# listeningInterface and `useHostNameAsBookieID` settings.
-advertisedAddress={{ hostvars[inventory_hostname].public_ip }}
+# advertisedAddress=
# Whether the bookie allowed to use a loopback interface as its primary
# interface(i.e. the interface it uses to establish its identity)?
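With the template change above, the bookie falls back to advertising its detected address. If a deployment does need to pin a specific address (for example a public IP, as the removed line did), the setting can still be set per host; a sketch with a made-up address:

```conf
# Example only: pin the advertised address explicitly (hypothetical value).
advertisedAddress=10.0.1.23
```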
diff --git a/site2/docs/deploy-aws.md b/site2/docs/deploy-aws.md
index 1ef3cc3..f827177 100644
--- a/site2/docs/deploy-aws.md
+++ b/site2/docs/deploy-aws.md
@@ -41,6 +41,24 @@ $ cd incubator-pulsar/deployment/terraform-ansible/aws
## SSH setup
+> If you already have an SSH key and would like to use it, you can skip generating the SSH keys. Instead, update the `private_key_file` setting in the `ansible.cfg` file and the `public_key_path` setting in the `terraform.tfvars` file.
+>
+> For example, if you already have a private SSH key in `~/.ssh/pulsar_aws` and a public key in `~/.ssh/pulsar_aws.pub`, you can do the following:
+>
+> 1. Update `ansible.cfg` with the following values:
+>
+> ```shell
+> private_key_file=~/.ssh/pulsar_aws
+> ```
+>
+> 2. Update `terraform.tfvars` with the following values:
+>
+> ```shell
+> public_key_path=~/.ssh/pulsar_aws.pub
+> ```
+
In order to create the necessary AWS resources using Terraform, you'll need to create an SSH key. To create a private SSH key in `~/.ssh/id_rsa` and a public key in `~/.ssh/id_rsa.pub`:
```bash
@@ -133,6 +151,25 @@ At any point, you can destroy all AWS resources associated with your cluster usi
$ terraform destroy
```
+## Setup Disks
+
+Before you run the Pulsar playbook, you want to mount the disks to the correct directories on the bookie nodes.
+Since different types of machines have different disk layouts, if you change the `instance_types` in your Terraform
+config, you need to update the tasks defined in the `setup-disk.yaml` file.
+
+To set up disks on the bookie nodes, use this command:
+
+```bash
+$ ansible-playbook \
+ --user='ec2-user' \
+ --inventory=`which terraform-inventory` \
+ setup-disk.yaml
+```
+
+After running this command, the disks will be mounted under `/mnt/journal` as the journal disk and `/mnt/storage` as the ledger disk.
+It is important to run this command only once! If you run this command again after you have run the Pulsar playbook,
+it might erase your disks and cause the bookies to fail to start up.
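To double-check the result before continuing, an ad-hoc Ansible command against the same inventory can list the mount points. This is only a sketch; it assumes `terraform-inventory` is on your PATH and that the bookie hosts are in the `pulsar` group:

```bash
# Quick sanity check (sketch): confirm both mount points on every pulsar host.
$ ansible pulsar \
    --user='ec2-user' \
    --inventory=`which terraform-inventory` \
    -m shell -a 'df -h /mnt/journal /mnt/storage'
```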
+
## Running the Pulsar playbook
Once you've created the necessary AWS resources using Terraform, you can install and run Pulsar on the Terraform-created EC2 instances using Ansible. To do so, use this command:
@@ -150,7 +187,6 @@ If you've created a private SSH key at a location different from `~/.ssh/id_rsa`
$ ansible-playbook \
--user='ec2-user' \
--inventory=`which terraform-inventory` \
- --private-key="~/.ssh/some-non-default-key" \
../deploy-pulsar.yaml
```
diff --git a/site2/docs/deploy-bare-metal.md b/site2/docs/deploy-bare-metal.md
index c5ea893..78ea557 100644
--- a/site2/docs/deploy-bare-metal.md
+++ b/site2/docs/deploy-bare-metal.md
@@ -17,18 +17,23 @@ sidebar_label: Bare metal
Deploying a Pulsar cluster involves doing the following (in order):
-* Deploying a [ZooKeeper](#deploying-a-zookeeper-cluster) cluster
+* Deploying a [ZooKeeper](#deploying-a-zookeeper-cluster) cluster (optional)
* Initializing [cluster metadata](#initializing-cluster-metadata)
* Deploying a [BookKeeper](#deploying-a-bookkeeper-cluster) cluster
* Deploying one or more Pulsar [brokers](#deploying-pulsar-brokers)
+## Preparation
+
### Requirements
+> If you already have an existing ZooKeeper cluster and would like to reuse it, you don't need to prepare the machines
+> for running ZooKeeper.
+
To run Pulsar on bare metal, you will need:
* At least 6 Linux machines or VMs
* 3 running [ZooKeeper](https://zookeeper.apache.org)
- * 3 running a Pulsar broker and a [BookKeeper](https://bookkeeper.apache.org) bookie
+ * 3 running a Pulsar broker, and a [BookKeeper](https://bookkeeper.apache.org) bookie
* A single [DNS](https://en.wikipedia.org/wiki/Domain_Name_System) name covering all of the Pulsar broker hosts
Each machine in your cluster will need to have [Java
8](http://www.oracle.com/technetwork/java/javase/downloads/index.html) or
higher installed.
@@ -43,8 +48,12 @@ In this diagram, connecting clients need to be able to communicate with the Puls
When deploying a Pulsar cluster, we have some basic recommendations that you
should keep in mind when capacity planning.
+#### ZooKeeper
+
For machines running ZooKeeper, we recommend using lighter-weight machines or
VMs. Pulsar uses ZooKeeper only for periodic coordination- and
configuration-related tasks, *not* for basic operations. If you're running
Pulsar on [Amazon Web Services](https://aws.amazon.com/) (AWS), for example, a
[t2.small](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html)
instance would likely suffice.
+#### Bookies & Brokers
+
For machines running a bookie and a Pulsar broker, we recommend using more
powerful machines. For an AWS deployment, for example,
[i3.4xlarge](https://aws.amazon.com/blogs/aws/now-available-i3-instances-for-demanding-io-intensive-applications/)
instances may be appropriate. On those machines we also recommend:
* Fast CPUs and 10Gbps
[NIC](https://en.wikipedia.org/wiki/Network_interface_controller) (for Pulsar
brokers)
@@ -83,8 +92,52 @@ Directory | Contains
`lib` | The [JAR](https://en.wikipedia.org/wiki/JAR_(file_format)) files used by Pulsar.
`logs` | Logs created by the installation.
+## Installing Builtin Connectors (optional)
+
+> Since release `2.1.0-incubating`, Pulsar provides a separate binary distribution containing all the `builtin` connectors.
+> If you would like to enable those `builtin` connectors, follow the instructions below; otherwise you can
+> skip this section for now.
+
+To get started using builtin connectors, you'll need to download the connectors tarball release on every broker node in
+one of the following ways:
+
+* by clicking the link below and downloading the release from an Apache mirror:
+
+  * <a href="pulsar:connector_release_url" download>Pulsar IO Connectors {{pulsar:version}} release</a>
+
+* from the Pulsar [downloads page](pulsar:download_page_url)
+* from the Pulsar [releases page](https://github.com/apache/incubator-pulsar/releases/latest)
+* using [wget](https://www.gnu.org/software/wget):
+
+ ```shell
+ $ wget pulsar:connector_release_url
+ ```
+
+Once the tarball is downloaded, untar the io-connectors package and move the extracted connectors into a directory named `connectors`
+inside the pulsar directory:
+
+```bash
+$ tar xvfz apache-pulsar-io-connectors-{{pulsar:version}}-bin.tar.gz
+
+# you will find a directory named `apache-pulsar-io-connectors-{{pulsar:version}}` in the pulsar directory;
+# then copy the connectors
+
+$ mv apache-pulsar-io-connectors-{{pulsar:version}}/connectors connectors
+
+$ ls connectors
+pulsar-io-aerospike-{{pulsar:version}}.nar
+pulsar-io-cassandra-{{pulsar:version}}.nar
+pulsar-io-kafka-{{pulsar:version}}.nar
+pulsar-io-kinesis-{{pulsar:version}}.nar
+pulsar-io-rabbitmq-{{pulsar:version}}.nar
+pulsar-io-twitter-{{pulsar:version}}.nar
+...
+```
+
## Deploying a ZooKeeper cluster
+> If you already have an existing ZooKeeper cluster and would like to use it, you can skip this section.
+
[ZooKeeper](https://zookeeper.apache.org) manages a variety of essential
coordination- and configuration-related tasks for Pulsar. To deploy a Pulsar
cluster you'll need to deploy ZooKeeper first (before all other components). We
recommend deploying a 3-node ZooKeeper cluster. Pulsar does not make heavy use
of ZooKeeper, so more lightweight machines or VMs should suffice for running
ZooKeeper.
To begin, add all ZooKeeper servers to the configuration specified in
[`conf/zookeeper.conf`](reference-configuration.md#zookeeper) (in the Pulsar
directory you created [above](#installing-the-pulsar-binary-package)). Here's
an example:
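The example itself falls outside this hunk; for reference, the server list in `conf/zookeeper.conf` follows the standard ZooKeeper form. A sketch using the same example hostnames as the rest of this guide:

```properties
server.1=zk1.us-west.example.com:2888:3888
server.2=zk2.us-west.example.com:2888:3888
server.3=zk3.us-west.example.com:2888:3888
```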
@@ -155,6 +208,15 @@ zkServers=zk1.us-west.example.com:2181,zk2.us-west.example.com:2181,zk3.us-west.
Once you've appropriately modified the `zkServers` parameter, you can provide
any other configuration modifications you need. You can find a full listing of
the available BookKeeper configuration parameters
[here](reference-configuration.md#bookkeeper), although we would recommend
consulting the [BookKeeper
documentation](http://bookkeeper.apache.org/docs/latest/reference/config/) for
a more in-depth guide.
+> ##### NOTES
+>
+> Since the 2.1.0 release, Pulsar supports [stateful functions](functions-state.md) for Pulsar Functions. If you would like to enable that feature,
+> you need to enable the table service on BookKeeper by adding the following setting to the `conf/bookkeeper.conf` file.
+>
+> ```conf
+> extraServerComponents=org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent
+> ```
+
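Once a bookie restarts with that component enabled, one rough way to confirm the stream storage (table) service is up is to probe its gRPC port. Both the default port (4181) and this check are assumptions rather than part of the documented procedure:

```bash
# Sketch: check that the stream storage service is listening (port 4181 assumed).
$ nc -zv localhost 4181
```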
Once you've applied the desired configuration in `conf/bookkeeper.conf`, you
can start up a bookie on each of your BookKeeper hosts. You can start up each
bookie either in the background, using
[nohup](https://en.wikipedia.org/wiki/Nohup), or in the foreground.
To start the bookie in the background, use the
[`pulsar-daemon`](reference-cli-tools.md#pulsar-daemon) CLI tool:
@@ -169,7 +231,7 @@ To start the bookie in the foreground:
$ bin/bookkeeper bookie
```
-You can verify that the bookie is working properly using the `bookiesanity` command for the [BookKeeper shell](http://localhost:4000/docs/latest/deployment/reference/CliTools#bookkeeper-shell):
+You can verify that a bookie is working properly by running the `bookiesanity` command for the [BookKeeper shell](reference-cli-tools.md#shell) on it:
```bash
$ bin/bookkeeper shell bookiesanity
@@ -177,10 +239,22 @@ $ bin/bookkeeper shell bookiesanity
This will create an ephemeral BookKeeper ledger on the local bookie, write a
few entries, read them back, and finally delete the ledger.
+After you have started all the bookies, you can run the `simpletest` command for the [BookKeeper shell](reference-cli-tools.md#shell) on any bookie node to
+verify that all the bookies in the cluster are up and running.
+
+```bash
+$ bin/bookkeeper shell simpletest --ensemble <num-bookies> --writeQuorum <num-bookies> --ackQuorum <num-bookies> --numEntries <num-entries>
+```
+
+This command will create a `num-bookies`-sized ledger on the cluster, write a few entries, and finally delete the ledger.
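For example, with the 3-bookie cluster described in this guide, a concrete invocation could look like this (the entry count is arbitrary):

```bash
$ bin/bookkeeper shell simpletest --ensemble 3 --writeQuorum 3 --ackQuorum 3 --numEntries 100
```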
+
+
## Deploying Pulsar brokers
Pulsar brokers are the last thing you need to deploy in your Pulsar cluster.
Brokers handle Pulsar messages and provide Pulsar's administrative interface.
We recommend running **3 brokers**, one for each machine that's already running
a BookKeeper bookie.
+### Configuring Brokers
+
The most important element of broker configuration is ensuring that each broker is aware of the ZooKeeper cluster that you've deployed. Make sure that the [`zookeeperServers`](reference-configuration.md#broker-zookeeperServers) and [`configurationStoreServers`](reference-configuration.md#broker-configurationStoreServers) parameters are set correctly. In this case, since we only have 1 cluster and no configuration store setup, the `configurationStoreServers` will point to the same `zookeeperServers`.
```properties
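# Sketch of the settings this (truncated) example sets, pointing both parameters
# at the ZooKeeper hosts used earlier in this guide:
zookeeperServers=zk1.us-west.example.com:2181,zk2.us-west.example.com:2181,zk3.us-west.example.com:2181
configurationStoreServers=zk1.us-west.example.com:2181,zk2.us-west.example.com:2181,zk3.us-west.example.com:2181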
@@ -194,6 +268,24 @@ You also need to specify the cluster name (matching the name that you provided w
clusterName=pulsar-cluster-1
```
+### Enabling Pulsar Functions (optional)
+
+If you want to enable [Pulsar Functions](functions-overview.md), follow the instructions below:
+
+1. Edit `conf/broker.conf` to enable the functions worker by setting `functionsWorkerEnabled` to `true`.
+
+ ```conf
+ functionsWorkerEnabled=true
+ ```
+
+2. Edit `conf/functions_worker.yml` and set `pulsarFunctionsCluster` to the cluster name that you provided when [initializing the cluster's metadata](#initializing-cluster-metadata).
+
+ ```conf
+ pulsarFunctionsCluster=pulsar-cluster-1
+ ```
+
+### Starting Brokers
+
You can then provide any other configuration changes that you'd like in the
[`conf/broker.conf`](reference-configuration.md#broker) file. Once you've
decided on a configuration, you can start up the brokers for your Pulsar
cluster. Like ZooKeeper and BookKeeper, brokers can be started either in the
foreground or in the background, using nohup.
You can start a broker in the foreground using the [`pulsar
broker`](reference-cli-tools.md#pulsar-broker) command:
@@ -233,3 +325,32 @@ $ bin/pulsar-client produce \
> You may need to use a different cluster name in the topic if you specified a
> cluster name different from `pulsar-cluster-1`.
This will publish a single message to the Pulsar topic.
+
+## Running Functions
+
+> If you have [enabled](#enabling-pulsar-functions-optional) Pulsar Functions, you can also try them out now.
+
+Create an ExclamationFunction named `exclamation`.
+
+```bash
+bin/pulsar-admin functions create \
+ --jar examples/api-examples.jar \
+ --className org.apache.pulsar.functions.api.examples.ExclamationFunction \
+ --inputs persistent://public/default/exclamation-input \
+ --output persistent://public/default/exclamation-output \
+ --tenant public \
+ --namespace default \
+ --name exclamation
+```
+
+Check whether the function is running as expected by [triggering](functions-deploying.md#triggering-pulsar-functions) it.
+
+```bash
+bin/pulsar-admin functions trigger --name exclamation --triggerValue "hello world"
+```
+
+You will see the following output:
+
+```shell
+hello world!
+```