sijie closed pull request #2290: [website] fix anchors in functions
documentation pages and add `State` documentation
URL: https://github.com/apache/incubator-pulsar/pull/2290
This PR was merged from a forked repository.
Because GitHub hides the original diff of a fork PR once it is merged, the
diff is reproduced below for the sake of provenance:
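Most hunks in this diff replace short hand-written anchors (for example
`#local-run`, `#java-sdk`) with anchors that match the slug generated from the
target heading (`#local-run-mode`, `#java-sdk-functions`), and duplicate
headings in the CLI reference get a numeric suffix (`#create-1`, `#list-2`).
The following is a minimal sketch of that kind of slug generation. It assumes
GitHub-style rules (lowercase, punctuation dropped, spaces turned into
hyphens, a counter appended on repeats); the website's actual generator may
differ in edge cases, and `slugify` and `Slugger` are hypothetical names used
only for illustration.

```python
import re
from collections import defaultdict


def slugify(heading: str) -> str:
    """Approximate a GitHub-style heading slug (assumed rules, see above)."""
    text = heading.strip().lower()
    # Keep only word characters, hyphens, and spaces; drop other punctuation.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


class Slugger:
    """Hypothetical helper: appends -1, -2, ... when a slug repeats on a page."""

    def __init__(self) -> None:
        self._seen = defaultdict(int)

    def slug(self, heading: str) -> str:
        base = slugify(heading)
        count = self._seen[base]
        self._seen[base] += 1
        return base if count == 0 else f"{base}-{count}"


if __name__ == "__main__":
    # Headings that appear in the diff and the anchors the PR switches to.
    for title in ("Local run mode", "Java SDK functions", "Python SDK functions"):
        print(f"{title!r} -> #{slugify(title)}")

    # A page with the same heading under several commands yields suffixed slugs.
    page = Slugger()
    print([page.slug("create") for _ in range(2)])
```

Under these assumptions, a heading titled "Serialization and deserialization
(SerDe)" (title inferred from the new anchor, not shown in the diff) would map
to `#serialization-and-deserialization-serde`, which is the anchor the diff
substitutes for `#serde`.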
diff --git a/site2/docs/functions-api.md b/site2/docs/functions-api.md
index 247b8f3ce8..a52128cba2 100644
--- a/site2/docs/functions-api.md
+++ b/site2/docs/functions-api.md
@@ -4,7 +4,7 @@ title: The Pulsar Functions API
sidebar_label: API
---
-[Pulsar Functions](functions-overview.md) provides an easy-to-use API that
developers can use to create and manage processing logic for the Apache Pulsar
messaging system. With Pulsar Functions, you can write functions of any level
of complexity in [Java](#java) or [Python](#python) and run them in conjunction
with a Pulsar cluster without needing to run a separate stream processing
engine.
+[Pulsar Functions](functions-overview.md) provides an easy-to-use API that
developers can use to create and manage processing logic for the Apache Pulsar
messaging system. With Pulsar Functions, you can write functions of any level
of complexity in [Java](#functions-for-java) or [Python](#functions-for-python)
and run them in conjunction with a Pulsar cluster without needing to run a
separate stream processing engine.
> For a more in-depth overview of the Pulsar Functions feature, see the
> [Pulsar Functions overview](functions-overview.md).
@@ -19,8 +19,8 @@ Pulsar Functions provide a wide range of functionality but
are based on a very s
You could use Pulsar Functions, for example, to set up the following
processing chain:
-* A [Python](#python) function listens on the `raw-sentences` topic and
"[sanitizes](#example-function)" incoming strings (removing extraneous
whitespace and converting all characters to lower case) and then publishes the
results to a `sanitized-sentences` topic
-* A [Java](#java) function listens on the `sanitized-sentences` topic, counts
the number of times each word appears within a specified time window, and
publishes the results to a `results` topic
+* A [Python](#functions-for-python) function listens on the `raw-sentences`
topic and "[sanitizes](#example-function)" incoming strings (removing
extraneous whitespace and converting all characters to lower case) and then
publishes the results to a `sanitized-sentences` topic
+* A [Java](#functions-for-java) function listens on the `sanitized-sentences`
topic, counts the number of times each word appears within a specified time
window, and publishes the results to a `results` topic
* Finally, a Python function listens on the `results` topic and writes the
results to a MySQL table
### Example function
@@ -42,7 +42,7 @@ Some things to note about this Pulsar Function:
### Example deployment
-Deploying Pulsar Functions is handled by the
[`pulsar-admin`](reference-pulsar-admin.md) CLI tool, in particular the
[`functions`](reference-pulsar-admin.md#functions) command. Here's an example
command that would run our [sanitizer](#example-function) function from above
in [local run](functions-deploying.md#local-run) mode:
+Deploying Pulsar Functions is handled by the
[`pulsar-admin`](reference-pulsar-admin.md) CLI tool, in particular the
[`functions`](reference-pulsar-admin.md#functions) command. Here's an example
command that would run our [sanitizer](#example-function) function from above
in [local run](functions-deploying.md#local-run-mode) mode:
```bash
$ bin/pulsar-admin functions localrun \
@@ -74,7 +74,7 @@ def process(input):
return "{}!".format(input)
```
-This function, however, would use the Pulsar Functions [SDK for
Python](#python-sdk):
+This function, however, would use the Pulsar Functions [SDK for
Python](#python-sdk-functions):
```python
from pulsar import Function
@@ -96,23 +96,24 @@ In both languages, however, you can write your own custom
SerDe logic for more c
### Context
-Both the [Java](#java-sdk) and [Python](#python-sdk) SDKs provide access to a
**context object** that can be used by the function. This context object
provides a wide variety of information and functionality to the function:
+Both the [Java](#java-sdk-functions) and [Python](#python-sdk-functions) SDKs
provide access to a **context object** that can be used by the function. This
context object provides a wide variety of information and functionality to the
function:
* The name and ID of the Pulsar Function
* The message ID of each message. Each Pulsar message is automatically
assigned an ID.
* The name of the topic on which the message was sent
* The names of all input topics as well as the output topic associated with
the function
-* The name of the class used for [SerDe](#serde)
+* The name of the class used for
[SerDe](#serialization-and-deserialization-serde)
* The [tenant](reference-terminology.md#tenant) and namespace associated with
the function
* The ID of the Pulsar Functions instance running the function
* The version of the function
-* The [logger object](#logging) used by the function, which can be used to
create function log messages
+* The [logger object](functions-overview.md#logging) used by the function,
which can be used to create function log messages
* Access to arbitrary [user config](#user-config) values supplied via the CLI
* An interface for recording [metrics](functions-metrics.md)
+* An interface for storing and retrieving state in [state
storage](functions-overview.md#state-storage)
### User config
-When you run or update Pulsar Functions created using the [SDK](#apis), you
can pass arbitrary key/values to them via the command line with the
`--userConfig` flag. Key/values must be specified as JSON. Here's an example of
a function creation command that passes a user config key/value to a function:
+When you run or update Pulsar Functions created using the
[SDK](#available-apis), you can pass arbitrary key/values to them via the
command line with the `--userConfig` flag. Key/values must be specified as
JSON. Here's an example of a function creation command that passes a user
config key/value to a function:
```bash
$ bin/pulsar-admin functions create \
@@ -144,18 +145,18 @@ class WordFilter(Function):
Writing Pulsar Functions in Java involves implementing one of two interfaces:
* The
[`java.util.Function`](https://docs.oracle.com/javase/8/docs/api/java/util/function/Function.html)
interface
-* The {@inject:
javadoc:Function:client/org/apache/pulsar/functions/api/Function} interface.
This interface works much like the `java.util.Function` interface, but with the
important difference that it provides a {@inject:
javadoc:Context:/client/org/apache/pulsar/functions/api/Context} object that
you can use in a [variety of ways](#context)
+* The {@inject:
javadoc:Function:/pulsar-functions/org/apache/pulsar/functions/api/Function}
interface. This interface works much like the `java.util.Function` interface,
but with the important difference that it provides a {@inject:
javadoc:Context:/pulsar-functions/org/apache/pulsar/functions/api/Context}
object that you can use in a [variety of ways](#context)
### Getting started
-In order to write Pulsar Functions in Java, you'll need to install the proper
[dependencies](#java-dependencies) and package your function [as a
JAR](#java-packaging).
+In order to write Pulsar Functions in Java, you'll need to install the proper
[dependencies](#dependencies) and package your function [as a JAR](#packaging).
#### Dependencies
How you get started writing Pulsar Functions in Java depends on which API
you're using:
-* If you're writing a [Java native function](#java-native), you won't need any
external dependencies.
-* If you're writing a [Java SDK function](#java-sdk), you'll need to import
the `pulsar-functions-api` library.
+* If you're writing a [Java native function](#java-native-functions), you
won't need any external dependencies.
+* If you're writing a [Java SDK function](#java-sdk-functions), you'll need to
import the `pulsar-functions-api` library.
Here's an example for a Maven `pom.xml` configuration file:
@@ -177,14 +178,14 @@ How you get started writing Pulsar Functions in Java
depends on which API you're
#### Packaging
-Whether you're writing Java Pulsar Functions using the [native](#java-native)
Java `java.util.Function` interface or using the [Java SDK](#java-sdk), you'll
need to package your function(s) as a "fat" JAR.
+Whether you're writing Java Pulsar Functions using the
[native](#java-native-functions) Java `java.util.Function` interface or using
the [Java SDK](#java-sdk-functions), you'll need to package your function(s) as
a "fat" JAR.
> #### Starter repo
> If you'd like to get up and running quickly, you can use [this
> repo](https://github.com/streamlio/pulsar-functions-java-starter), which
> contains the necessary Maven configuration to build a fat JAR as well as
> some example functions.
### Java native functions
-If your function doesn't require access to its [context](#java-context), you
can create a Pulsar Function by implementing the
[`java.util.Function`](https://docs.oracle.com/javase/8/docs/api/java/util/function/Function.html)
interface, which has this very simple, single-method signature:
+If your function doesn't require access to its [context](#context), you can
create a Pulsar Function by implementing the
[`java.util.Function`](https://docs.oracle.com/javase/8/docs/api/java/util/function/Function.html)
interface, which has this very simple, single-method signature:
```java
public interface Function<I, O> {
@@ -205,29 +206,29 @@ public class ExclamationFunction implements
Function<String, String> {
}
```
-In general, you should use native functions when you don't need access to the
function's [context](#context). If you *do* need access to the function's
context, then we recommend using the [Pulsar Functions Java SDK](#java-sdk).
+In general, you should use native functions when you don't need access to the
function's [context](#context). If you *do* need access to the function's
context, then we recommend using the [Pulsar Functions Java
SDK](#java-sdk-functions).
#### Java native examples
-There is one example Java native function in [this
folder](https://github.com/apache/incubator-pulsar/tree/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples):
+There is one example Java native function in this {@inject:
github:folder:/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples}:
-*
[`JavaNativeExclamationFunction`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/JavaNativeExclamationFunction.java)
+* {@inject:
github:`JavaNativeExclamationFunction`:/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/JavaNativeExclamationFunction.java}
### Java SDK functions
-To get started developing Pulsar Functions using the Java SDK, you'll need to
add a dependency on the `pulsar-functions-api` artifact to your project.
Instructions can be found [above](#java-dependencies).
+To get started developing Pulsar Functions using the Java SDK, you'll need to
add a dependency on the `pulsar-functions-api` artifact to your project.
Instructions can be found [above](#dependencies).
> An easy way to get up and running with Pulsar Functions in Java is to clone
> the
> [`pulsar-functions-java-starter`](https://github.com/streamlio/pulsar-functions-java-starter)
> repo and follow the instructions there.
#### Java SDK examples
-There are several example Java SDK functions in [this
folder](https://github.com/apache/incubator-pulsar/tree/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples):
+There are several example Java SDK functions in this {@inject:
github:folder:/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples}:
Function name | Description
:-------------|:-----------
[`ContextFunction`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/ContextFunction.java)
| Illustrates [context](#context)-specific functionality like
[logging](#java-logging) and [metrics](#java-metrics)
-[`CounterFunction`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/CounterFunction.java)
| Illustrates usage of Pulsar Function
[counters](functions-overview.md#counters)
+[`WordCountFunction`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/WordCountFunction.java)
| Illustrates usage of Pulsar Function
[state-storage](functions-overview.md#state-storage)
[`ExclamationFunction`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/ExclamationFunction.java)
| A basic string manipulation function for the Java SDK
[`LoggingFunction`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/LoggingFunction.java)
| A function that shows how [logging](#java-logging) works for Java
[`PublishFunction`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/PublishFunction.java)
| Publishes results to a topic specified in the function's [user
config](#java-user-config) (rather than on the function's output topic)
@@ -312,7 +313,7 @@ public class LogFunction implements PulsarFunction<String,
Void> {
### Java SerDe
-Pulsar Functions use [SerDe](#serde) when publishing data to and consuming
data from Pulsar topics. When you're writing Pulsar Functions in Java, the
following basic Java types are built in and supported by default:
+Pulsar Functions use [SerDe](#serialization-and-deserialization-serde) when
publishing data to and consuming data from Pulsar topics. When you're writing
Pulsar Functions in Java, the following basic Java types are built in and
supported by default:
* `String`
* `Double`
@@ -376,7 +377,7 @@ To apply this custom SerDe to a particular Pulsar Function,
you would need to:
* Package the `Tweet` and `TweetSerde` classes into a JAR
* Specify a path to the JAR and SerDe class name when deploying the function
-Here's an example
[`create`](reference-pulsar-admin.md#pulsar-admin-functions-create) operation:
+Here's an example [`create`](reference-pulsar-admin.md#create-1) operation:
```bash
$ bin/pulsar-admin functions create \
@@ -390,7 +391,7 @@ $ bin/pulsar-admin functions create \
### Java logging
-Pulsar Functions that use the [Java SDK](#java-sdk) have access to an
[SLF4j](https://www.slf4j.org/)
[`Logger`](https://www.slf4j.org/api/org/apache/log4j/Logger.html) object that
can be used to produce logs at the chosen log level. Here's a simple example
function that logs either a `WARNING`- or `INFO`-level log based on whether the
incoming string contains the word `danger`:
+Pulsar Functions that use the [Java SDK](#java-sdk-functions) have access to
an [SLF4j](https://www.slf4j.org/)
[`Logger`](https://www.slf4j.org/api/org/apache/log4j/Logger.html) object that
can be used to produce logs at the chosen log level. Here's a simple example
function that logs either a `WARNING`- or `INFO`-level log based on whether the
incoming string contains the word `danger`:
```java
import org.apache.pulsar.functions.api.Context;
@@ -428,7 +429,7 @@ Now, all logs produced by the `LoggingFunction` above can
be accessed via the `p
### Java user config
-The Java SDK's [`Context`](#java-context) object enables you to access
key/value pairs provided to the Pulsar Function via the command line (as JSON).
Here's an example function creation command that passes a key/value pair:
+The Java SDK's [`Context`](#context) object enables you to access key/value
pairs provided to the Pulsar Function via the command line (as JSON). Here's an
example function creation command that passes a key/value pair:
```bash
$ bin/pulsar-admin functions create \
@@ -476,7 +477,7 @@ String wotd =
context.getUserConfigValueOrDefault("word-of-the-day", "perspicaci
### Java metrics
-You can record metrics using the [`Context`](#java-context) object on a
per-key basis. You can, for example, set a metric for the key `process-count`
and a different metric for the key `elevens-count` every time the function
processes a message. Here's an example:
+You can record metrics using the [`Context`](#context) object on a per-key
basis. You can, for example, set a metric for the key `process-count` and a
different metric for the key `elevens-count` every time the function processes
a message. Here's an example:
```java
import org.apache.pulsar.functions.api.Context;
@@ -506,7 +507,7 @@ public class MetricRecorderFunction implements
Function<Integer, Void> {
Writing Pulsar Functions in Python entails implementing one of two things:
* A `process` function that takes an input (message data from the function's
input topic(s)), applies some kind of logic to it, and either returns an object
(to be published to the function's output topic) or `pass`es and thus doesn't
produce a message
-* A `Function` class that has a `process` method that provides a message input
to process and a [context](#python-context) object
+* A `Function` class that has a `process` method that provides a message input
to process and a [context](#context) object
### Getting started
@@ -518,7 +519,7 @@ Regardless of which [deployment
mode](functions-deploying.md) you're using, you'
* grpcio
* grpcio-tools
-That could be your local machine for [local run
mode](functions-deploying.md#local-run) or a machine running a Pulsar
[broker](reference-terminology.md#broker) for [cluster
mode](functions-deploying.md#cluster-mode). To install those libraries using
pip:
+That could be your local machine for [local run
mode](functions-deploying.md#local-run-mode) or a machine running a Pulsar
[broker](reference-terminology.md#broker) for [cluster
mode](functions-deploying.md#cluster-mode). To install those libraries using
pip:
```bash
$ pip install pulsar-client protobuf futures grpcio grpcio-tools
@@ -537,13 +538,13 @@ def process(input):
return "{0}!".format(input)
```
-In general, you should use native functions when you don't need access to the
function's [context](#context). If you *do* need access to the function's
context, then we recommend using the [Pulsar Functions Python SDK](#python-sdk).
+In general, you should use native functions when you don't need access to the
function's [context](#context). If you *do* need access to the function's
context, then we recommend using the [Pulsar Functions Python
SDK](#python-sdk-functions).
#### Python native examples
-There is one example Python native function in [this
folder](https://github.com/apache/incubator-pulsar/tree/master/pulsar-functions/python-examples):
+There is one example Python native function in this {@inject:
github:folder:/pulsar-functions/python-examples}:
-*
[`native_exclamation_function.py`](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/python-examples/native_exclamation_function.py)
+* {@inject:
github:`native_exclamation_function.py`:/pulsar-functions/python-examples/native_exclamation_function.py}
### Python SDK functions
@@ -551,7 +552,7 @@ To get started developing Pulsar Functions using the Python
SDK, you'll need to
#### Python SDK examples
-There are several example Python functions in [this
folder](https://github.com/apache/incubator-pulsar/blob/master/pulsar-functions/python-examples):
+There are several example Python functions in this {@inject:
github:folder:/pulsar-functions/python-examples}:
Function file | Description
:-------------|:-----------
@@ -581,7 +582,7 @@ Method | What it provides
### Python SerDe
-Pulsar Functions use [SerDe](#serde) when publishing data to and consuming
data from Pulsar topics (this is true of both [native](#python-native)
functions and [SDK](#python-sdk) functions). You can specify the SerDe when
[creating](functions-deploying.md#cluster-mode) or
[running](functions-deploying.md#local-run) functions. Here's an example:
+Pulsar Functions use [SerDe](#serialization-and-deserialization-serde) when
publishing data to and consuming data from Pulsar topics (this is true of both
[native](#python-native-functions) functions and [SDK](#python-sdk-functions)
functions). You can specify the SerDe when
[creating](functions-deploying.md#cluster-mode) or
[running](functions-deploying.md#local-run-mode) functions. Here's an example:
```bash
$ bin/pulsar-admin functions create \
@@ -644,7 +645,7 @@ In order to use this class in Pulsar Functions, you'd have
two options:
### Python logging
-Pulsar Functions that use the [Python SDK](#python-sdk) have access to a
logging object that can be used to produce logs at the chosen log level. Here's
a simple example function that logs either a `WARNING`- or `INFO`-level log
based on whether the incoming string contains the word `danger`:
+Pulsar Functions that use the [Python SDK](#python-sdk-functions) have access
to a logging object that can be used to produce logs at the chosen log level.
Here's a simple example function that logs either a `WARNING`- or `INFO`-level
log based on whether the incoming string contains the word `danger`:
```python
from pulsar import Function
@@ -673,7 +674,7 @@ Now, all logs produced by the `LoggingFunction` above can
be accessed via the `l
### Python user config
-The Python SDK's [`Context`](#python-context) object enables you to access
key/value pairs provided to the Pulsar Function via the command line (as JSON).
Here's an example function creation command that passes a key/value pair:
+The Python SDK's [`Context`](#context) object enables you to access key/value
pairs provided to the Pulsar Function via the command line (as JSON). Here's an
example function creation command that passes a key/value pair:
```bash
$ bin/pulsar-admin functions create \
@@ -698,7 +699,7 @@ class UserConfigFunction(Function):
### Python metrics
-You can record metrics using the [`Context`](#python-context) object on a
per-key basis. You can, for example, set a metric for the key `process-count`
and a different metric for the key `elevens-count` every time the function
processes a message. Here's an example:
+You can record metrics using the [`Context`](#context) object on a per-key
basis. You can, for example, set a metric for the key `process-count` and a
different metric for the key `elevens-count` every time the function processes
a message. Here's an example:
```python
from pulsar import Function
diff --git a/site2/docs/functions-deploying.md
b/site2/docs/functions-deploying.md
index a4964345a4..dcf2d3b753 100644
--- a/site2/docs/functions-deploying.md
+++ b/site2/docs/functions-deploying.md
@@ -25,7 +25,7 @@ If you're running a
non-[standalone](reference-terminology.md#standalone) cluste
## Command-line interface
-Pulsar Functions are deployed and managed using the [`pulsar-admin
functions`](reference-pulsar-admin.md#functions) interface, which contains
commands such as [`create`](reference-pulsar-admin.md#functions-create) for
deploying functions in [cluster mode](#cluster-mode),
[`trigger`](reference-pulsar-admin.md#functions-trigger) for
[triggering](#triggering) functions,
[`list`](reference-pulsar-admin.md#functions-list) for listing deployed
functions, and several others.
+Pulsar Functions are deployed and managed using the [`pulsar-admin
functions`](reference-pulsar-admin.md#functions) interface, which contains
commands such as [`create`](reference-pulsar-admin.md#functions-create) for
deploying functions in [cluster mode](#cluster-mode),
[`trigger`](reference-pulsar-admin.md#trigger) for
[triggering](#triggering-pulsar-functions) functions,
[`list`](reference-pulsar-admin.md#list-2) for listing deployed functions, and
several others.
### Fully Qualified Function Name (FQFN)
@@ -47,8 +47,8 @@ Function name | Whichever value is specified for the class
name (minus org, libr
Tenant | Derived from the input topics' names. If the input topics are under
the `marketing` tenant---i.e. the topic names have the form
`persistent://marketing/{namespace}/{topicName}`---then the tenant will be
`marketing`.
Namespace | Derived from the input topics' names. If the input topics are
under the `asia` namespace under the `marketing` tenant---i.e. the topic names
have the form `persistent://marketing/asia/{topicName}`, then the namespace
will be `asia`.
Output topic | `{input topic}-{function name}-output`. A function with an
input topic name of `incoming` and a function name of `exclamation`, for
example, would have an output topic of `incoming-exclamation-output`.
-Subscription type | For at-least-once and at-most-once [processing
guarantees](functions-gaurantees.md), the
[`SHARED`](concepts-messaging.md#shared) is applied by default; for
effectively-once guarantees, [`FAILOVER`](concepts-messaging.md#failover) is
applied
-Processing guarantees | [`ATLEAST_ONCE`](functions-gaurantees.md)
+Subscription type | For at-least-once and at-most-once [processing
guarantees](functions-guarantees.md), the
[`SHARED`](concepts-messaging.md#shared) is applied by default; for
effectively-once guarantees, [`FAILOVER`](concepts-messaging.md#failover) is
applied
+Processing guarantees | [`ATLEAST_ONCE`](functions-guarantees.md)
Pulsar service URL | `pulsar://localhost:6650`
#### Example use of defaults
@@ -66,7 +66,7 @@ The created function would have default values supplied for
the function name (`
## Local run mode
-If you run a Pulsar Function in **local run** mode, it will run on the machine
from which the command is run (this could be your laptop, an [AWS
EC2](https://aws.amazon.com/ec2/) instance, etc.). Here's an example
[`localrun`](reference-pulsar-admin.md#functions-localrun) command:
+If you run a Pulsar Function in **local run** mode, it will run on the machine
from which the command is run (this could be your laptop, an [AWS
EC2](https://aws.amazon.com/ec2/) instance, etc.). Here's an example
[`localrun`](reference-pulsar-admin.md#localrun) command:
```bash
$ bin/pulsar-admin functions localrun \
@@ -86,7 +86,7 @@ $ bin/pulsar-admin functions localrun \
## Cluster mode
-When you run a Pulsar Function in **cluster mode**, the function code will be
uploaded to a Pulsar broker and run *alongside the broker* rather than in your
[local environment](#local-run). You can run a function in cluster mode using
the [`create`](reference-pulsar-admin.md#functions-create) command. Here's an
example:
+When you run a Pulsar Function in **cluster mode**, the function code will be
uploaded to a Pulsar broker and run *alongside the broker* rather than in your
[local environment](#local-run-mode). You can run a function in cluster mode
using the [`create`](reference-pulsar-admin.md#create-1) command. Here's an
example:
```bash
$ bin/pulsar-admin functions create \
@@ -98,7 +98,7 @@ $ bin/pulsar-admin functions create \
### Updating cluster mode functions
-You can use the [`update`](reference-pulsar-admin.md#functions-update) command
to update a Pulsar Function running in cluster mode. This command, for example,
would update the function created in the section [above](#cluster-mode):
+You can use the [`update`](reference-pulsar-admin.md#update-1) command to
update a Pulsar Function running in cluster mode. This command, for example,
would update the function created in the section [above](#cluster-mode):
```bash
$ bin/pulsar-admin functions update \
@@ -110,7 +110,7 @@ $ bin/pulsar-admin functions update \
### Parallelism
-Pulsar Functions run as processes called **instances**. When you run a Pulsar
Function, it runs as a single instance by default (and in [local run
mode](#local-run) you can *only* run a single instance of a function).
+Pulsar Functions run as processes called **instances**. When you run a Pulsar
Function, it runs as a single instance by default (and in [local run
mode](#local-run-mode) you can *only* run a single instance of a function).
You can also specify the *parallelism* of a function, i.e. the number of
instances to run, when you create the function. You can set the parallelism
factor using the `--parallelism` flag of the
[`create`](reference-pulsar-admin.md#functions-create) command. Here's an
example:
@@ -120,7 +120,7 @@ $ bin/pulsar-admin functions create \
# Other function info
```
-You can adjust the parallelism of an already created function using the
[`update`](reference-pulsar-admin.md#functions-update) interface.
+You can adjust the parallelism of an already created function using the
[`update`](reference-pulsar-admin.md#update-1) interface.
```bash
$ bin/pulsar-admin functions update \
@@ -148,7 +148,7 @@ $ bin/pulsar-admin functions update \
### Function instance resources
-When you run Pulsar Functions in [cluster run](#cluster-run) mode, you can
specify the resources that are assigned to each function
[instance](#parallelism):
+When you run Pulsar Functions in [cluster run](#cluster-mode) mode, you can
specify the resources that are assigned to each function
[instance](#parallelism):
Resource | Specified as... | Runtimes
:--------|:----------------|:--------
@@ -174,9 +174,9 @@ $ bin/pulsar-admin functions create \
If a Pulsar Function is running in [cluster mode](#cluster-mode), you can
**trigger** it at any time using the command line. Triggering a function means
that you send a message with a specific value to the function and get the
function's output (if any) via the command line.
-> Triggering a function is ultimately no different from invoking a function by
producing a message on one of the function's input topics. The [`pulsar-admin
functions trigger`](reference-pulsar-admin.md#functions-trigger) command is
essentially a convenient mechanism for sending messages to functions without
needing to use the [`pulsar-client`](reference-cli-tools.md#pulsar-client) tool
or a language-specific client library.
+> Triggering a function is ultimately no different from invoking a function by
producing a message on one of the function's input topics. The [`pulsar-admin
functions trigger`](reference-pulsar-admin.md#trigger) command is essentially a
convenient mechanism for sending messages to functions without needing to use
the [`pulsar-client`](reference-cli-tools.md#pulsar-client) tool or a
language-specific client library.
-To show an example of function triggering, let's start with a simple [Python
function](functions-api.md#python) that returns a simple string based on the
input:
+To show an example of function triggering, let's start with a simple [Python
function](functions-api.md#functions-for-python) that returns a simple string
based on the input:
```python
# myfunc.py
@@ -184,7 +184,7 @@ def process(input):
return "This function has been triggered with a value of {0}".format(input)
```
-Let's run that function in [local run mode](functions-deploying.md#local-run):
+Let's run that function in [local run
mode](functions-deploying.md#local-run-mode):
```bash
$ bin/pulsar-admin functions create \
@@ -197,7 +197,7 @@ $ bin/pulsar-admin functions create \
--output persistent://public/default/out
```
-Now let's make a consumer listen on the output topic for messages coming from
the `myfunc` function using the [`pulsar-client
consume`](reference-cli-tools.md#pulsar-client-consume) command:
+Now let's make a consumer listen on the output topic for messages coming from
the `myfunc` function using the [`pulsar-client
consume`](reference-cli-tools.md#consume) command:
```bash
$ bin/pulsar-client consume persistent://public/default/out \
@@ -224,15 +224,3 @@ This function has been triggered with a value of hello
world
> #### Topic info not required
> In the `trigger` command above, you may have noticed that you only need to
> specify basic information about the function (tenant, namespace, and name).
> To trigger the function, you didn't need to know the function's input
> topic(s).
-
-<!--
-## Subscription types
-
-Pulsar supports three different [subscription
types](concepts-messaging.md#subscription-modes) (or subscription modes) for
Pulsar clients:
-
-* With [exclusive](concepts-messaging.md#exclusive) subscriptions, only a
single [consumer](reference-terminology.md#consumer) is allowed to attach to
the subscription.
-* With [shared](concepts-messaging.md#shared) . Please note that strict
message ordering is *not* guaranteed with shared subscriptions.
-* With [failover](concepts-messaging.md#failover) subscriptions
-
-Pulsar Functions can also be assigned a subscription type when you
[create](#cluster-mode) them or run them [locally](#local-run). In cluster
mode, the subscription can also be [updated](#updating) after the function has
been created.
--->
diff --git a/site2/docs/functions-gaurantees.md
b/site2/docs/functions-guarantees.md
similarity index 88%
rename from site2/docs/functions-gaurantees.md
rename to site2/docs/functions-guarantees.md
index a32efb10b4..1c731eb864 100644
--- a/site2/docs/functions-gaurantees.md
+++ b/site2/docs/functions-guarantees.md
@@ -14,7 +14,7 @@ Delivery semantics | Description
## Applying processing guarantees to a function
-You can set the processing guarantees for a Pulsar Function when you create
the Function. This [`pulsar-function
create`](reference-pulsar-admin.md#pulsar-admin-functions-create) command, for
example, would apply effectively-once guarantees to the Function:
+You can set the processing guarantees for a Pulsar Function when you create
the Function. This [`pulsar-function
create`](reference-pulsar-admin.md#create-1) command, for example, would apply
effectively-once guarantees to the Function:
```bash
$ bin/pulsar-admin functions create \
@@ -32,7 +32,7 @@ The available options are:
## Updating the processing guarantees of a function
-You can change the processing guarantees applied to a function once it's
already been created using the
[`update`](reference-pulsar-admin.md#pulsar-admin-functions-update) command.
Here's an example:
+You can change the processing guarantees applied to a function once it's
already been created using the [`update`](reference-pulsar-admin.md#update-1)
command. Here's an example:
```bash
$ bin/pulsar-admin functions update \
diff --git a/site2/docs/functions-metrics.md b/site2/docs/functions-metrics.md
index 896a401ad3..8ac307f53d 100644
--- a/site2/docs/functions-metrics.md
+++ b/site2/docs/functions-metrics.md
@@ -7,7 +7,7 @@ sidebar_label: Metrics
Pulsar Functions can publish arbitrary metrics to the metrics interface which
can then be queried. This doc contains instructions for publishing metrics
using the [Java](#java-sdk) and [Python](#python-sdk) Pulsar Functions SDKs.
> #### Metrics and stats not available through language-native interfaces
-> If a Pulsar Function uses the language-native interface for
[Java](functions-api.md#java-native) or [Python](#python-native), that function
will not be able to publish metrics and stats to Pulsar.
+> If a Pulsar Function uses the language-native interface for
[Java](functions-api.md#java-native-functions) or
[Python](#python-native-functions), that function will not be able to publish
metrics and stats to Pulsar.
## Accessing metrics
@@ -15,7 +15,7 @@ For a guide to accessing metrics created by Pulsar Functions,
see the guide to [
## Java SDK
-If you're creating a Pulsar Function using the [Java
SDK](functions-api.md#java-sdk), the {@inject:
javadoc:Context:/client/org/apache/pulsar/functions/api/Context} object has a
`recordMetric` method that you can use to register both a name for the metric
and a value. Here's the signature for that method:
+If you're creating a Pulsar Function using the [Java
SDK](functions-api.md#java-sdk-functions), the {@inject:
javadoc:Context:/pulsar-functions/org/apache/pulsar/functions/api/Context}
object has a `recordMetric` method that you can use to register both a name for
the metric and a value. Here's the signature for that method:
```java
void recordMetric(String metricName, double value);
@@ -40,4 +40,4 @@ This function counts the length of each incoming message (of
type `String`) and
## Python SDK
-Documentation for the [Python SDK](functions-api.md#python-sdk) is coming soon.
+Documentation for the [Python SDK](functions-api.md#python-sdk-functions) is
coming soon.
diff --git a/site2/docs/functions-overview.md b/site2/docs/functions-overview.md
index da2591dba4..2bc1e3651e 100644
--- a/site2/docs/functions-overview.md
+++ b/site2/docs/functions-overview.md
@@ -10,7 +10,7 @@ sidebar_label: Overview
* apply a user-supplied processing logic to each message,
* publish the results of the computation to another topic
-Here's an example Pulsar Function for Java (using the [native
interface](functions-api.md#java-native)):
+Here's an example Pulsar Function for Java (using the [native
interface](functions-api.md#java-native-functions)):
```java
import java.util.Function;
@@ -21,7 +21,7 @@ public class ExclamationFunction implements Function<String,
String> {
}
```
-Here's an equivalent function in Python (also using the [native
interface](functions-api.md#python-native)):
+Here's an equivalent function in Python (also using the [native
interface](functions-api.md#python-native-functions)):
```python
def process(input):
@@ -34,7 +34,7 @@ Functions are executed each time a message is published to
the input topic. If a
The core goal behind Pulsar Functions is to enable you to easily create
processing logic of any level of complexity without needing to deploy a
separate neighboring system (such as [Apache Storm](http://storm.apache.org/),
[Apache Heron](https://apache.github.io/incubator-heron), [Apache
Flink](https://flink.apache.org/), etc.). Pulsar Functions is essentially
ready-made compute infrastructure at your disposal as part of your Pulsar
messaging system. This core goal is tied to a series of other goals:
-* Developer productivity ([language-native](#native) vs. [Pulsar Functions
SDK](#sdk) functions)
+* Developer productivity ([language-native](#language-native-functions) vs.
[Pulsar Functions SDK](#the-pulsar-functions-sdk) functions)
* Easy troubleshooting
* Operational simplicity (no need for an external processing system)
@@ -54,12 +54,12 @@ Pulsar Functions could be described as
The core programming model behind Pulsar Functions is very simple:
-* Functions receive messages from one or more **input
[topics](reference-teminology.md#topic)**. Every time a message is received,
the function can do a variety of things:
+* Functions receive messages from one or more **input
[topics](reference-terminology.md#topic)**. Every time a message is received,
the function can do a variety of things:
* Apply some processing logic to the input and write output to:
* An **output topic** in Pulsar
* [Apache BookKeeper](#state-storage)
* Write logs to a **log topic** (potentially for debugging purposes)
- * Increment a [counter](#counters)
+ * Increment a [counter](#word-count-example)

@@ -69,7 +69,7 @@ If you were to implement the classic word count example using
Pulsar Functions,

-If you were writing the function in [Java](functions-api.md#java) using the
[Pulsar Functions SDK for Java](functions-api.md#java-sdk), you could write the
function like this...
+If you were writing the function in
[Java](functions-api.md#functions-for-java) using the [Pulsar Functions SDK for
Java](functions-api.md#java-sdk-functions), you could write the function like
this...
```java
package org.example.functions;
@@ -92,7 +92,7 @@ public class WordCountFunction implements Function<String,
Void> {
}
```
-...and then [deploy it](#cluster-mode) in your Pulsar cluster using the
[command line](#cli) like this:
+...and then [deploy it](#cluster-run-mode) in your Pulsar cluster using the
[command line](#command-line-interface) like this:
```bash
$ bin/pulsar-admin functions create \
@@ -141,7 +141,7 @@ class RoutingFunction(Function):
## Command-line interface
-Pulsar Functions are managed using the
[`pulsar-admin`](reference-pulsar-admin.md) CLI tool (in particular the
[`functions`](reference-pulsar-admin.md#pulsar-admin-functions) command).
Here's an example command that would run a function in [local run
mode](#local-run):
+Pulsar Functions are managed using the
[`pulsar-admin`](reference-pulsar-admin.md) CLI tool (in particular the
[`functions`](reference-pulsar-admin.md#functions) command). Here's an example
command that would run a function in [local run mode](#local-run-mode):
```bash
$ bin/pulsar-functions localrun \
@@ -165,7 +165,7 @@ FQFNs enable you to, for example, create multiple functions
with the same name p
Pulsar Functions can be configured in two ways:
-* Via [command-line arguments](#cli) passed to the [`pulsar-admin
functions`](reference-pulsar-admin.md#pulsar-admin-functions) interface
+* Via [command-line arguments](#command-line-interface) passed to the
[`pulsar-admin functions`](reference-pulsar-admin.md#functions) interface
* Via [YAML](http://yaml.org/) configuration files
If you're supplying a YAML configuration, you must specify a path to the file
on the command line. Here's an example:
@@ -192,7 +192,7 @@ You can also mix and match configuration methods by
specifying some function att
## Supported languages
-Pulsar Functions can currently be written in [Java](functions-api.md#java) and
[Python](functions-api.md#python). Support for additional languages is coming
soon.
+Pulsar Functions can currently be written in
[Java](functions-api.md#functions-for-java) and
[Python](functions-api.md#functions-for-python). Support for additional
languages is coming soon.
## The Pulsar Functions API
@@ -203,12 +203,12 @@ The Pulsar Functions API enables you to create processing
logic that is:
### Function context
-Each Pulsar Function created using the [Pulsar Functions SDK](#sdk) has access
to a context object that both provides:
+Each Pulsar Function created using the [Pulsar Functions
SDK](#the-pulsar-functions-sdk) has access to a context object that both
provides:
1. A wide variety of information about the function, including:
* The name of the function
* The tenant and namespace of the function
- * [User-supplied configuration](#user-config) values
+ * [User-supplied configuration](#user-configuration) values
2. Special functionality, including:
* The ability to produce [logs](#logging) to a specified logging topic
* The ability to produce [metrics](#metrics)
@@ -217,11 +217,11 @@ Each Pulsar Function created using the [Pulsar Functions
SDK](#sdk) has access t
Both Java and Python support writing "native" functions, i.e. Pulsar Functions
with no dependencies.
-The benefit of native functions is that they don't have any dependencies
beyond what's already available in Java/Python "out of the box." The downside
is that they don't provide access to the function's [context](#context), which
is necessary for a variety of functionality, including [logging](#logging),
[user configuration](#user-config), and more.
+The benefit of native functions is that they don't have any dependencies
beyond what's already available in Java/Python "out of the box." The downside
is that they don't provide access to the function's
[context](#function-context), which is necessary for a variety of
functionality, including [logging](#logging), [user
configuration](#user-configuration), and more.
## The Pulsar Functions SDK
-If you'd like a Pulsar Function to have access to a [context
object](#context), you can use the **Pulsar Functions SDK**, available for both
[Java](functions-api.md#java-sdk) and [Pythnon](functions-api.md#python-sdk).
+If you'd like a Pulsar Function to have access to a [context
object](#function-context), you can use the **Pulsar Functions SDK**, available
for both [Java](functions-api.md#functions-for-java) and
[Python](functions-api.md#functions-for-python).
### Java
@@ -267,12 +267,12 @@ The Pulsar Functions feature was built to support a
variety of deployment option
Deployment mode | Description
:---------------|:-----------
-[Local run mode](#local-run) | The function runs in your local environment,
for example on your laptop
-[Cluster mode](#cluster-run) | The function runs *inside of* your Pulsar
cluster, on the same machines as your Pulsar
[brokers](reference-terminology.md#broker)
+[Local run mode](#local-run-mode) | The function runs in your local
environment, for example on your laptop
+[Cluster mode](#cluster-run-mode) | The function runs *inside of* your Pulsar
cluster, on the same machines as your Pulsar
[brokers](reference-terminology.md#broker)
### Local run mode
-If you run a Pulsar Function in **local run** mode, it will run on the machine
from which the command is run (this could be your laptop, an [AWS
EC2](https://aws.amazon.com/ec2/) instance, etc.). Here's an example
[`localrun`](reference-pulsar-admin.md#pulsar-admin-functions-localrun) command:
+If you run a Pulsar Function in **local run** mode, it will run on the machine
from which the command is run (this could be your laptop, an [AWS
EC2](https://aws.amazon.com/ec2/) instance, etc.). Here's an example
[`localrun`](reference-pulsar-admin.md#localrun) command:
```bash
$ bin/pulsar-admin functions localrun \
@@ -292,7 +292,7 @@ $ bin/pulsar-admin functions localrun \
### Cluster run mode
-When you run a Pulsar Function in **cluster mode**, the function code will be
uploaded to a Pulsar broker and run *alongside the broker* rather than in your
[local environment](#local-run). You can run a function in cluster mode using
the [`create`](reference-pulsar-admin.md#pulsar-admin-functions-create)
command. Here's an example:
+When you run a Pulsar Function in **cluster mode**, the function code will be
uploaded to a Pulsar broker and run *alongside the broker* rather than in your
[local environment](#local-run-mode). You can run a function in cluster mode
using the [`create`](reference-pulsar-admin.md#create-1) command. Here's an
example:
```bash
$ bin/pulsar-admin functions create \
@@ -306,7 +306,7 @@ This command will upload `myfunc.py` to Pulsar, which will
use the code to start
### Parallelism
-By default, only one **instance** of a Pulsar Function runs when you create
and run it in [cluster run mode](#cluster-run). You can also, however, run
multiple instances in parallel. You can specify the number of instances when
you create the function, or update an existing single-instance function with a
new parallelism factor.
+By default, only one **instance** of a Pulsar Function runs when you create
and run it in [cluster run mode](#cluster-run-mode). You can also, however, run
multiple instances in parallel. You can specify the number of instances when
you create the function, or update an existing single-instance function with a
new parallelism factor.
This command, for example, would create and run a function with a parallelism
of 5 (i.e. 5 instances):
@@ -322,7 +322,7 @@ $ bin/pulsar-admin functions create \
### Function instance resources
-When you run Pulsar Functions in [cluster run](#cluster-run) mode, you can
specify the resources that are assigned to each function
[instance](#parallelism):
+When you run Pulsar Functions in [cluster run](#cluster-run-mode) mode, you
can specify the resources that are assigned to each function
[instance](#parallelism):
Resource | Specified as... | Runtimes
:--------|:----------------|:--------
@@ -345,7 +345,7 @@ For more information on resources, see the [Deploying and
Managing Pulsar Functi
### Logging
-Pulsar Functions created using the [Pulsar Functions SDK(#sdk) can send logs
to a log topic that you specify as part of the function's configuration. The
function created using the command below, for example, would produce all logs
on the `persistent://public/default/my-func-1-log` topic:
+Pulsar Functions created using the [Pulsar Functions
SDK](#the-pulsar-functions-sdk) can send logs to a log topic that you specify
as part of the function's configuration. The function created using the command
below, for example, would produce all logs on the
`persistent://public/default/my-func-1-log` topic:
```bash
$ bin/pulsar-admin functions create \
@@ -398,11 +398,11 @@ public class ConfigMapFunction implements
Function<String, Void> {
### Triggering Pulsar Functions
-Pulsar Functions running in [cluster mode](#cluster-mode) can be
[triggered](functions-deploying.md#triggering) via the [command line](#cli).
With triggering you can easily pass a specific value to a function and get the
function's return value *without* needing to worry about creating a client,
sending a message to the right input topic, etc. Triggering can be very useful
for---but is by no means limited to---testing and debugging purposes.
+Pulsar Functions running in [cluster mode](#cluster-run-mode) can be
[triggered](functions-deploying.md#triggering-pulsar-functions) via the
[command line](#command-line-interface). With triggering you can easily pass a
specific value to a function and get the function's return value *without*
needing to worry about creating a client, sending a message to the right input
topic, etc. Triggering can be very useful for---but is by no means limited
to---testing and debugging purposes.
-> Triggering a function is ultimately no different from invoking a function by
producing a message on one of the function's input topics. The [`pulsar-admin
functions trigger`](reference-pulsar-admin.md#pulsar-admin-functions-trigger)
command is essentially a convenient mechanism for sending messages to functions
without needing to use the
[`pulsar-client`](reference-pulsar-admin.md#pulsar-client) tool or a
language-specific client library.
+> Triggering a function is ultimately no different from invoking a function by
producing a message on one of the function's input topics. The [`pulsar-admin
functions trigger`](reference-pulsar-admin.md#trigger) command is essentially a
convenient mechanism for sending messages to functions without needing to use
the [`pulsar-client`](reference-cli-tools.md#pulsar-client) tool or a
language-specific client library.
-Let's take an example Pulsar Function written in Python (using the [native
interface](functions-api.md#python-native)) that simply reverses string inputs:
+Let's take an example Pulsar Function written in Python (using the [native
interface](functions-api.md#python-native-functions)) that simply reverses
string inputs:
```python
def process(input):
@@ -433,7 +433,7 @@ Delivery semantics | Description
**At-least-once** delivery | Each message that is sent to the function could
be processed more than once (hence the "at least")
**Effectively-once** delivery | Each message that is sent to the function will
have one output associated with it
-This command, for example, would run a function in [cluster
mode](#cluster-mode) with effectively-once guarantees applied:
+This command, for example, would run a function in [cluster
mode](#cluster-run-mode) with effectively-once guarantees applied:
```bash
$ bin/pulsar-admin functions create \
@@ -444,7 +444,7 @@ $ bin/pulsar-admin functions create \
## Metrics
-Pulsar Functions that use the [Pulsar Functions SDK](#sdk) can publish metrics
to Pulsar. For more information, see [Metrics for Pulsar
Functions](functions-metrics.md).
+Pulsar Functions that use the [Pulsar Functions
SDK](#the-pulsar-functions-sdk) can publish metrics to Pulsar. For more
information, see [Metrics for Pulsar Functions](functions-metrics.md).
## State storage
diff --git a/site2/docs/functions-quickstart.md
b/site2/docs/functions-quickstart.md
index d50ae9c417..980a2c6a96 100644
--- a/site2/docs/functions-quickstart.md
+++ b/site2/docs/functions-quickstart.md
@@ -4,7 +4,7 @@ title: Getting started with Pulsar Functions
sidebar_label: Getting started
---
-This tutorial will walk you through running a
[standalone](reference-teminology.md#standalone) Pulsar
[cluster](reference-teminology.md#cluster) on your machine and then running
your first Pulsar Functions using that cluster. The first function will run in
local run mode (outside your Pulsar
[cluster](reference-teminology.md#cluster)), while the second will run in
cluster mode (inside your cluster).
+This tutorial will walk you through running a
[standalone](reference-terminology.md#standalone) Pulsar
[cluster](reference-terminology.md#cluster) on your machine and then running
your first Pulsar Functions using that cluster. The first function will run in
local run mode (outside your Pulsar
[cluster](reference-terminology.md#cluster)), while the second will run in
cluster mode (inside your cluster).
> In local run mode, your Pulsar Function will communicate with your Pulsar
> cluster but will run outside of the cluster.
@@ -14,12 +14,12 @@ In order to follow along with this tutorial, you'll need to
have [Maven](https:/
## Run a standalone Pulsar cluster
-In order to run our Pulsar Functions, we'll need to run a Pulsar cluster
locally first. The easiest way to do that is to run Pulsar in
[standalone](reference-teminology.md#standalone) mode. Follow these steps to
start up a standalone cluster:
+In order to run our Pulsar Functions, we'll need to run a Pulsar cluster
locally first. The easiest way to do that is to run Pulsar in
[standalone](reference-terminology.md#standalone) mode. Follow these steps to
start up a standalone cluster:
```bash
-$ wget
https://repository.apache.org/content/repositories/snapshots/org/apache/pulsar/distribution/2.0.0-incubating-SNAPSHOT/distribution-2.0.0-incubating-{{
site.preview_version_id }}-bin.tar.gz
-$ tar xvf distribution-2.0.0-incubating-{{ site.preview_version_id
}}-bin.tar.gz
-$ cd apache-pulsar-2.0.0-incubating-SNAPSHOT
+$ wget pulsar:binary_release_url
+$ tar xvfz apache-pulsar-{{pulsar:version}}-bin.tar.gz
+$ cd apache-pulsar-{{pulsar:version}}
$ bin/pulsar standalone \
--advertised-address 127.0.0.1
```
@@ -60,7 +60,7 @@ $ bin/pulsar-admin functions localrun \
> --inputs topic1,topic2
> ```
-We can open up another shell and use the
[`pulsar-client`](reference-pulsar-admin.md#pulsar-client) tool to listen for
messages on the output topic:
+We can open up another shell and use the
[`pulsar-client`](reference-cli-tools.md#pulsar-client) tool to listen for
messages on the output topic:
```bash
$ bin/pulsar-client consume persistent://public/default/exclamation-output \
@@ -95,7 +95,7 @@ Here's what happened:
## Run a Pulsar Function in cluster mode
-[Local run mode](#local-run-mode) is useful for development and
experimentation, but if you want to use Pulsar Functions in a real Pulsar
deployment, you'll want to run them in **cluster mode**. In this mode, Pulsar
Functions run *inside* your Pulsar cluster and are managed using the same
[`pulsar-admin functions`](reference-pulsar-admin.md#pulsar-admin-functions)
interface that we've been using thus far.
+[Local run mode](#run-a-pulsar-function-in-local-run-mode) is useful for
development and experimentation, but if you want to use Pulsar Functions in a
real Pulsar deployment, you'll want to run them in **cluster mode**. In this
mode, Pulsar Functions run *inside* your Pulsar cluster and are managed using
the same [`pulsar-admin functions`](reference-pulsar-admin.md#functions)
interface that we've been using thus far.
This command, for example, would deploy the same exclamation function we ran
locally above *in our Pulsar cluster* (rather than outside it):
@@ -208,7 +208,7 @@ If you see `Deleted successfully` in the output, then
you've succesfully run, up
## Writing and running a new function
-> In order to write and run the [Python](functions-api.md#python) function
below, you'll need to install a few dependencies:
+> In order to write and run the
[Python](functions-api.md#functions-for-python) function below, you'll need to
install a few dependencies:
> ```bash
> $ pip install pulsar-client protobuf futures grpcio grpcio-tools
> ```
@@ -241,7 +241,7 @@ $ bin/pulsar-admin functions create \
--name reverse
```
-If you see `Created successfully`, the function is ready to accept incoming
messages. Because the function is running in cluster mode, we can **trigger**
the function using the
[`trigger`](reference-pulsar-admin.md#pulsar-admin-functions-trigger) command.
This command will send a message that we specify to the function and also give
us the function's output. Here's an example:
+If you see `Created successfully`, the function is ready to accept incoming
messages. Because the function is running in cluster mode, we can **trigger**
the function using the [`trigger`](reference-pulsar-admin.md#trigger) command.
This command will send a message that we specify to the function and also give
us the function's output. Here's an example:
```bash
$ bin/pulsar-admin functions trigger \
@@ -257,7 +257,7 @@ You should get this output:
This string was backwards but is now forwards
```
-Once again, success! We created a brand new Pulsar Function, deployed it in
our Pulsar standalone cluster in [cluster mode](#cluster-mode) and successfully
triggered the function. If you're ready for more, check out one of these docs:
+Once again, success! We created a brand new Pulsar Function, deployed it in
our Pulsar standalone cluster in [cluster
mode](#run-a-pulsar-function-in-cluster-mode) and successfully triggered the
function. If you're ready for more, check out one of these docs:
* [The Pulsar Functions API](functions-api.md)
* [Deploying Pulsar Functions](functions-deploying.md)
diff --git a/site2/docs/functions-state.md b/site2/docs/functions-state.md
new file mode 100644
index 0000000000..ade331bf52
--- /dev/null
+++ b/site2/docs/functions-state.md
@@ -0,0 +1,118 @@
+---
+id: functions-state
+title: Pulsar Functions State Storage (Developer Preview)
+sidebar_label: State Storage
+---
+
+Since the Pulsar 2.1.0 release, Pulsar integrates with Apache BookKeeper [table
service](https://docs.google.com/document/d/155xAwWv5IdOitHh1NVMEwCMGgB28M3FyMiQSxEpjE-Y/edit#heading=h.56rbh52koe3f)
+for storing the `State` for functions. For example, a `WordCount` function can
store its `counters` state into BookKeeper's table service via the Pulsar Functions
[State API](#api).
+
+## API
+
+### Java API
+
+Currently Pulsar Functions expose the following APIs for mutating and accessing
State. These APIs are available in the [Context](functions-api.md#context) object
when
+you are using [Java SDK](functions-api.md#java-sdk-functions) functions.
+
+#### incrCounter
+
+```java
+ /**
+ * Increment the builtin distributed counter referred to by key
+ * @param key The name of the key
+ * @param amount The amount to be incremented
+ */
+ void incrCounter(String key, long amount);
+```
+
+An application can use `incrCounter` to change the counter of a given `key` by
the given `amount`.
+
+#### getCounter
+
+```java
+ /**
+ * Retrieve the counter value for the key.
+ *
+ * @param key name of the key
+ * @return the amount of the counter value for this key
+ */
+ long getCounter(String key);
+```
+
+An application can use `getCounter` to retrieve the counter of a given `key`
mutated by `incrCounter`.
+
+Besides the `counter` API, Pulsar also exposes a general key/value API for
functions to store
+general key/value state.
+
+#### putState
+
+```java
+ /**
+ * Update the state value for the key.
+ *
+ * @param key name of the key
+ * @param value state value of the key
+ */
+ void putState(String key, ByteBuffer value);
+```
+
+#### getState
+
+```java
+ /**
+ * Retrieve the state value for the key.
+ *
+ * @param key name of the key
+ * @return the state value for the key.
+ */
+ ByteBuffer getState(String key);
+```
+
+### Python API
+
+State is not currently supported in the [Python
SDK](functions-api.md#python-sdk-functions).
+
+## Query State
+
+A Pulsar Function can use the [State API](#api) for storing state into
Pulsar's state storage
+and retrieving state back from Pulsar's state storage. Additionally Pulsar
also provides
+CLI commands for querying its state.
+
+```shell
+$ bin/pulsar-admin functions querystate \
+ --tenant <tenant> \
+ --namespace <namespace> \
+ --name <function-name> \
+ --state-storage-url <bookkeeper-service-url> \
+ --key <state-key> \
+ [--watch]
+```
+
+If `--watch` is specified, the CLI will watch the value of the provided
`state-key`.
+
+## Example
+
+### Java Example
+
+{@inject:
github:`WordCountFunction`:/pulsar-functions/java-examples/src/main/java/org/apache/pulsar/functions/api/examples/WordCountFunction.java}
is a good example
+demonstrating how an application can easily store `state` in Pulsar Functions.
+
+```java
+public class WordCountFunction implements Function<String, Void> {
+ @Override
+ public Void process(String input, Context context) throws Exception {
+ Arrays.asList(input.split("\\.")).forEach(word ->
context.incrCounter(word, 1));
+ return null;
+ }
+}
+```
+
+The logic of this `WordCount` function is pretty simple and straightforward:
+
+1. The function first splits the received `String` into multiple words using
regex `\\.`.
+2. For each `word`, the function increments the corresponding `counter` by 1
(via `incrCounter(key, amount)`).
+
+### Python Example
+
+State is not currently supported in the [Python
SDK](functions-api.md#python-sdk-functions).
+
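For readers of this diff, the state semantics described in the new `functions-state.md` can be tried out locally with a minimal sketch. The `InMemoryState` class below is hypothetical and is not part of Pulsar; it only mirrors the four method signatures quoted in the doc (`incrCounter`, `getCounter`, `putState`, `getState`) on top of plain maps. The behavior for a key that was never written (counter reads as 0, state reads as `null`) is an assumption, since the doc does not specify it. The `process` method reuses the body of `WordCountFunction`; note that the regex `\\.` splits on literal periods, not on whitespace.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the state part of the Context object.
// It is NOT the Pulsar API; it only illustrates the documented signatures.
public class InMemoryState {
    private final Map<String, Long> counters = new HashMap<>();
    private final Map<String, ByteBuffer> values = new HashMap<>();

    // Same signature as the documented incrCounter(String key, long amount).
    public void incrCounter(String key, long amount) {
        counters.merge(key, amount, Long::sum);
    }

    // Same signature as the documented getCounter(String key).
    // Assumption: a key that was never incremented reads as 0.
    public long getCounter(String key) {
        return counters.getOrDefault(key, 0L);
    }

    // Same signature as the documented putState(String key, ByteBuffer value).
    public void putState(String key, ByteBuffer value) {
        values.put(key, value.duplicate());
    }

    // Same signature as the documented getState(String key).
    // Assumption: a missing key returns null.
    public ByteBuffer getState(String key) {
        ByteBuffer stored = values.get(key);
        return stored == null ? null : stored.duplicate();
    }

    // Same body as WordCountFunction.process in the diff: split on literal periods.
    public void process(String input) {
        Arrays.asList(input.split("\\.")).forEach(word -> incrCounter(word, 1));
    }

    public static void main(String[] args) {
        InMemoryState state = new InMemoryState();
        state.process("hello.world.hello");
        System.out.println("hello = " + state.getCounter("hello"));
        System.out.println("world = " + state.getCounter("world"));

        // Round-trip a String through the ByteBuffer-based key/value calls.
        state.putState("greeting", ByteBuffer.wrap("hi".getBytes(StandardCharsets.UTF_8)));
        String decoded = StandardCharsets.UTF_8.decode(state.getState("greeting")).toString();
        System.out.println("greeting = " + decoded);
    }
}
```

Returning a `duplicate()` of the stored buffer is a choice made for this sketch so that decoding a value does not advance the position of the stored copy; whether the real state store behaves this way is not stated in the doc.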
diff --git a/site2/website/sidebars.json b/site2/website/sidebars.json
index 48a50d43da..a65629c75a 100644
--- a/site2/website/sidebars.json
+++ b/site2/website/sidebars.json
@@ -24,6 +24,7 @@
"functions-api",
"functions-deploying",
"functions-guarantees",
+ "functions-state",
"functions-metrics"
],
"Pulsar IO": [