sijia-w commented on a change in pull request #7904: URL: https://github.com/apache/pulsar/pull/7904#discussion_r479957540
########## File path: site2/docs/functions-package.md ##########
@@ -0,0 +1,524 @@
+---
+id: functions-package
+title: Package Pulsar Functions
+sidebar_label: "How-to: Package"
+---
+
+This section provides step-by-step instructions to package Pulsar functions in Java, Python, and Go.
+
+> **Tip**
+>
+> - Packaging a window function in Java is the same as [packaging a function in Java](#java) as below.
+>
+> - Currently, the window function is not available in Python and Go.
+
+## Prerequisite
+
+Before running a Pulsar function, you need to start Pulsar.
+
+### Run a standalone Pulsar in Docker
+
+This example uses Docker to run a standalone Pulsar.
+
+```bash
+docker run -it \
+    -p 6650:6650 \
+    -p 8080:8080 \
+    -v $PWD/data:/pulsar/data \
+    apachepulsar/pulsar:latest \
+    bin/pulsar standalone
+```
+
+> **Tip**
+>
+> - `$PWD/data` is the local directory. `-v` maps the `/pulsar/data` directory in the Docker image to the local `$PWD/data` directory.
+>
+> - To check whether the image starts up or not, use the command `docker ps`.
+
+### Run Pulsar cluster in k8s
+
+For details on how to deploy a Pulsar cluster in a k8s environment, refer to [here](https://pulsar.apache.org/docs/en/helm-overview/).
+
+
+## Java
+
+This example demonstrates how to package a function in Java.
+
+> **Note**
+>
+> This example assumes that you have [run a standalone Pulsar in Docker](#run-a-standalone-pulsar-in-docker) successfully.
+
+
+1. Create a new maven project with a pom file.
+
+    > **Tip**
+    >
+    > `mainClass` is the fully qualified name of your function class.
+
+    ```text
+    <?xml version="1.0" encoding="UTF-8"?>
+    <project xmlns="http://maven.apache.org/POM/4.0.0"
+             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+        <modelVersion>4.0.0</modelVersion>
+
+        <groupId>java-function</groupId>
+        <artifactId>java-function</artifactId>
+        <version>1.0-SNAPSHOT</version>
+
+        <dependencies>
+            <dependency>
+                <groupId>org.apache.pulsar</groupId>
+                <artifactId>pulsar-functions-api</artifactId>
+                <version>2.6.0</version>
+            </dependency>
+        </dependencies>
+
+        <build>
+            <plugins>
+                <plugin>
+                    <artifactId>maven-assembly-plugin</artifactId>
+                    <configuration>
+                        <appendAssemblyId>false</appendAssemblyId>
+                        <descriptorRefs>
+                            <descriptorRef>jar-with-dependencies</descriptorRef>
+                        </descriptorRefs>
+                        <archive>
+                            <manifest>
+                                <mainClass>org.example.test.ExclamationFunction</mainClass>
+                            </manifest>
+                        </archive>
+                    </configuration>
+                    <executions>
+                        <execution>
+                            <id>make-assembly</id>
+                            <phase>package</phase>
+                            <goals>
+                                <goal>assembly</goal>
+                            </goals>
+                        </execution>
+                    </executions>
+                </plugin>
+                <plugin>
+                    <groupId>org.apache.maven.plugins</groupId>
+                    <artifactId>maven-compiler-plugin</artifactId>
+                    <configuration>
+                        <source>8</source>
+                        <target>8</target>
+                    </configuration>
+                </plugin>
+            </plugins>
+        </build>
+
+    </project>
+    ```
+
+2. Write a Java function.
+
+    ```
+    package org.example.test;
+
+    import java.util.function.Function;
+
+    public class ExclamationFunction implements Function<String, String> {
+        @Override
+        public String apply(String s) {
+            return "This is my function!";
+        }
+    }
+    ```
+
+    > **Tip**
+    >
+    > For the package imported, you can use one of the following interfaces:
+    >
+    > - Function interface provided by Java 8: `java.util.function.Function`
+    >
+    > - Pulsar Function interface: `org.apache.pulsar.functions.api.Function`
+    >
+    > The main difference between the two interfaces is that the `org.apache.pulsar.functions.api.Function` interface provides the context interface. When you write a function and want to interact with it, you can use context to obtain a wide variety of information and functionality for Pulsar Functions.
+    >
+    > **Example**
+    >
+    > This example uses `org.apache.pulsar.functions.api.Function` interface with context.
+    >
+    > ```
+    > package org.example.functions;
+    >
+    > import org.apache.pulsar.functions.api.Context;
+    > import org.apache.pulsar.functions.api.Function;
+    >
+    > import java.util.Arrays;
+    >
+    > public class WordCountFunction implements Function<String, Void> {
+    >     // This function is invoked every time a message is published to the input topic
+    >     @Override
+    >     public Void process(String input, Context context) throws Exception {
+    >         Arrays.asList(input.split(" ")).forEach(word -> {
+    >             String counterKey = word.toLowerCase();
+    >             context.incrCounter(counterKey, 1);
+    >         });
+    >         return null;
+    >     }
+    > }
+    > ```
+
+3. Package the Java function.
+
+    ```bash
+    mvn package
+    ```
+
+    After the Java function is packaged, a `target` directory is automatically created. Open the `target` directory to see if there is a jar package similar to `java-function-1.0-SNAPSHOT.jar`.
+
+
+4. Run the Java function.
+
+    (1) Copy the packaged jar file to the Pulsar image.
+
+    ```bash
+    docker exec -it [CONTAINER ID] /bin/bash
+    docker cp <path of java-function-1.0-SNAPSHOT.jar> CONTAINER ID:/pulsar
+    ```
+
+    (2) Run the Java function using the following command.
+
+    ```bash
+    ./bin/pulsar-admin functions localrun \
+    --classname org.example.test.ExclamationFunction \
+    --jar java-function-1.0-SNAPSHOT.jar \
+    --inputs persistent://public/default/my-topic-1 \
+    --output persistent://public/default/test-1 \
+    --tenant public \
+    --namespace default \
+    --name JavaFunction
+    ```
+
+    The following log indicates that the Java function starts up successfully.
+
+    ```text
+    ...
+    07:55:03.724 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
+    ...
+    ```
+
+    > **Tip**
+    >
+    > - For the description about the parameters (for example, `--classname`, `--jar`, `--inputs`, and so on), run the command `./bin/pulsar-admin functions` to get more information or see [here](http://pulsar.apache.org/docs/en/pulsar-admin/#functions).
+    >
+    > - If you want to start a function in cluster mode, replace `localrun` with `create` in the command above. The following log indicates that the Java function starts up successfully.
+    >
+    > ```text
+    > "Created successfully"
+    > ```
+
+## Python
+
+Python Function supports the following three forms:
+
+- One python file
+- ZIP file
+- PIP
+
+### One python file
+
+This example demonstrates how to package a function by **one python file** in Python.
+
+> **Note**
+>
+> This example assumes that you have [run a standalone Pulsar in Docker](#run-a-standalone-pulsar-in-docker) successfully.
+
+1. Write a Python function.
+
+    ```
+    from pulsar import Function  # import the Function module from Pulsar
+
+    # The classic ExclamationFunction that appends an exclamation at the end
+    # of the input
+    class ExclamationFunction(Function):
+        def __init__(self):
+            pass
+
+        def process(self, input, context):
+            return input + '!'
+    ```
+
+    In this example, when you write a Python function, you need to inherit the Function class and implement the `process()` method.
+
+    `process()` mainly has two parameters:
+
+    - `input` represents your input.
+
+    - `context` represents an interface exposed by the Pulsar Function. You can get the attributes in the Python function based on the provided context object.
+
+2. Install a Python client.
+
+    The implementation of a Python function depends on the Python client, so before deploying a Python function, you need to install the corresponding version of the Python client.
+
+    ```bash
+    pip install pulsar-client==2.6.0
+    ```
+
+3. Run the Python Function.
+
+    (1) Copy the Python function file to the Pulsar image.
+
+    ```bash
+    docker exec -it [CONTAINER ID] /bin/bash
+    docker cp <path of Python function file> CONTAINER ID:/pulsar
+    ```
+
+    (2) Run the Python function using the following command.
+
+    ```bash
+    ./bin/pulsar-admin functions localrun \
+    --classname org.example.test.ExclamationFunction \
+    --py <path of Python Function file> \
+    --inputs persistent://public/default/my-topic-1 \
+    --output persistent://public/default/test-1 \
+    --tenant public \
+    --namespace default \
+    --name PythonFunction
+    ```
+
+    The following log indicates that the Python function starts up successfully.
+
+    ```text
+    ...
+    07:55:03.724 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
+    ...
+    ```
+
+    > **Tip**
+    >
+    > - For the description about the parameters (for example, `--classname`, `--py`, `--inputs`, and so on), run the command `./bin/pulsar-admin functions` to get more information or see [here](http://pulsar.apache.org/docs/en/pulsar-admin/#functions).
+    >
+    > - If you want to start a function in cluster mode, replace `localrun` with `create` in the command above. The following log indicates that the Python function starts up successfully.
+    >
+    > ```text
+    > "Created successfully"
+    > ```
+
+### ZIP file
+
+This example demonstrates how to package a function by **ZIP file** in Python.
+
+> **Note**
+>
+> This example assumes that you have [run a standalone Pulsar in Docker](#run-a-standalone-pulsar-in-docker) successfully.
+
+1. Prepare the ZIP file
+
+When packaging the ZIP file of the Python Function, the following requirements need to be met:
+
+```text
+Assuming zip file with format `func.zip`, extract to folder function and internal dir format:
+    "func/src"
+    "func/requirements.txt"
+    "func/deps"
+```
+Take [exclamation.zip](https://github.com/apache/pulsar/tree/master/tests/docker-images/latest-version-image/python-examples) as an example. Its internal structure is as follows:
+
+```text
+.
+├── deps
+│   └── sh-1.12.14-py2.py3-none-any.whl
+└── src
+    └── exclamation.py
+```
+
+2. Run the Python Function
+
+    (1) Copy the ZIP file to the Pulsar image.
+
+    ```bash
+    docker exec -it [CONTAINER ID] /bin/bash
+    docker cp <path of ZIP file> CONTAINER ID:/pulsar
+    ```
+
+    (2) Run the Python function using the following command.
+
+    ```bash
+    ./bin/pulsar-admin functions localrun \
+    --classname exclamation \
+    --py <path of ZIP file> \
+    --inputs persistent://public/default/in-topic \
+    --output persistent://public/default/out-topic \
+    --tenant public \
+    --namespace default \
+    --name PythonFunction
+    ```
+
+    The following log indicates that the Python function starts up successfully.
+
+    ```text
+    ...
+    07:55:03.724 [main] INFO org.apache.pulsar.functions.runtime.ProcessRuntime - Started process successfully
+    ...
+    ```
+
+    > **Tip**
+    >
+    > - For the description about the parameters (for example, `--classname`, `--py`, `--inputs`, and so on), run the command `./bin/pulsar-admin functions` to get more information or see [here](http://pulsar.apache.org/docs/en/pulsar-admin/#functions).
Review comment:

```suggestion
    > - For the description about the parameters (for example, `--classname`, `--py`, `--inputs`, and so on), run the command `./bin/pulsar-admin functions` or see [here](http://pulsar.apache.org/docs/en/pulsar-admin/#functions).
```
