loongs-zhang commented on code in PR #21992:
URL: https://github.com/apache/pulsar/pull/21992#discussion_r1494060140
##########
pip/pip-331.md:
##########
@@ -0,0 +1,129 @@
+# PIP-331: WASM Function API
+
+# Background knowledge
+
WASM (WebAssembly) bytecode is designed to be encoded in a size- and
load-time-efficient binary format. WASM aims to leverage the common hardware
features available on various platforms to execute in browsers at machine code
speed.
+
+WASI (WebAssembly System Interface) provides a portable interface for
applications that run within a constrained sandbox environment, which allows
WASM to run in non-browser environments such as Linux. It is portable and secure.
+
+# Motivation
+
+The server and client sides of the Pulsar function use protobuf for
decoupling. In principle, any language supported by protobuf can be supported
by the Pulsar function. Pulsar currently provides Java, Python, and Go
function clients, but many languages are still not supported.
+
+Until all language adaptations are completed (which is almost certainly
impossible), users cannot write Pulsar functions in the languages they are
familiar with.
+
+# Goals
+
+## In Scope
+
+Users can write Pulsar functions in any other language whose code can be
compiled into WASM bytecode (such as Rust, Go, or C++).
+
+## Out of Scope
+
+The existing capabilities of the Java Pulsar function client are not
reimplemented; WASM Pulsar functions run on top of the Java Pulsar functions.
+
+Because WASM imposes strict requirements on parameter types, and for
simplicity, types other than `java.lang.Long` are not used as parameters or
return values.
+
+# High Level Design
+
+```mermaid
+flowchart LR;
+
+ subgraph develop
+ direction TB
+ SourceCode ==> |"CompileToWASM"| WasmFile ==> |"RenameFile"|
MoveToTheResourceDirectory ==> UnitTest
+ end
+
+ subgraph runtime
+ direction TB
+ PulsarFunctionJava ==> |"LoadFromResource"| TheWasmFile ==> |"Invoke"|
TheSourceCode
+ end
+
+ develop --> runtime
+```
+
+# Detailed Design
+
+## Design & Implementation Details
+
+1. Add `WasmLoader` to load the WASM file and expose its WASM functions to
Java, and also expose Java functions to WASM if needed.
+
+2. Add `AbstractWasmFunction` and `AbstractWasmWindowFunction` as the core
interfaces of the WASM function API.
+
+```java
+public abstract class AbstractWasmFunction<X, T> extends WasmLoader implements
Function<X, T> {
+
+ private static final String PROCESS_METHOD_NAME = "process";
+
+ protected static final String INITIALIZE_METHOD_NAME = "initialize";
+
+ protected static final String CLOSE_METHOD_NAME = "close";
+
+ protected static final Map<Long, Argument<?>> ARGUMENTS = new
ConcurrentHashMap<>();
+
+ @Override
+ public T process(X input, Context context) {
+ return super.getWasmExtern(PROCESS_METHOD_NAME)
+ .map(process -> {
+ Long argumentId = callWASI(input, context, process);
+ return doProcess(input, context, argumentId);
+ })
+ .orElseThrow(() -> new PulsarWasmException(
+ PROCESS_METHOD_NAME + " function not found in " +
super.getWasmName()));
+ }
+
+ private Long callWASI(X input,
+ Context context,
+ Extern process) {
+ // call WASI function
+ final Long argumentId = getArgumentId(input, context);
+ ARGUMENTS.put(argumentId, new Argument<>(input, context));
+ // WASI cannot easily pass Java objects like JNI, here we pass Long
+ // then we can get the argument by Long
+ WasmFunctions.consumer(super.getStore(), process.func(),
WasmValType.I64)
+ .accept(argumentId);
+ ARGUMENTS.remove(argumentId);
+ return argumentId;
+ }
+
+ protected abstract T doProcess(X input, Context context, Long argumentId);
+
+ protected abstract Long getArgumentId(X input, Context context);
+
+ @Override
+ public void initialize(Context context) {
+ super.getWasmExtern(INITIALIZE_METHOD_NAME)
+ .ifPresent(initialize -> callWASI(null, context, initialize));
+ }
+
+ @Override
+ public void close() {
+ super.getWasmExtern(CLOSE_METHOD_NAME)
+ .ifPresent(close -> callWASI(null, null, close));
+ super.close();
+ }
+
+ protected static class Argument<X> {
+ protected X input;
+ protected Context context;
+
+ private Argument(X input, Context context) {
+ this.input = input;
+ this.context = context;
+ }
+ }
+}
+```
+
+A more detailed implementation and tests can be found
[here](https://github.com/apache/pulsar/pull/21975).
+
+# Security Considerations
+
+It may be necessary to add folders named after the tenancy in the resource
directory to prevent conflicts between WASM file names of different tenancies.
Review Comment:
> What is the resource directory?

For example, `resource/{tenancyName}/{wasmFileName}.wasm`.

> Is it shared today by Pulsar functions?

It should not be shared; adding the tenancy name is just to avoid conflicts
between identical file paths.
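
To make the path layout concrete, here is a minimal sketch of how such a per-tenancy path could be built. The class `WasmResourcePaths`, its `resolve` method, and the `resource` root are hypothetical names used only for illustration; they are not part of the PIP or of the linked implementation, and the name validation is an assumption about what a real implementation would want.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the PIP: builds resource/{tenancyName}/{wasmFileName}.wasm
final class WasmResourcePaths {

    // Assumed root directory; the PIP does not fix where the resource directory lives.
    private static final Path ROOT = Paths.get("resource");

    private WasmResourcePaths() {
    }

    // Returns the per-tenancy location of a WASM file.
    static Path resolve(String tenancyName, String wasmFileName) {
        requireSimpleName(tenancyName);
        requireSimpleName(wasmFileName);
        return ROOT.resolve(tenancyName).resolve(wasmFileName + ".wasm");
    }

    // Rejects empty names and names that could escape the tenancy folder.
    private static void requireSimpleName(String name) {
        if (name == null || name.isEmpty()
                || name.contains("/") || name.contains("\\") || name.contains("..")) {
            throw new IllegalArgumentException("Invalid name: " + name);
        }
    }

    public static void main(String[] args) {
        // The same file name under two tenancies resolves to two distinct paths.
        System.out.println(resolve("tenantA", "echo"));
        System.out.println(resolve("tenantB", "echo"));
    }
}
```

With this layout, two tenancies that both upload a file named `echo` end up with different paths, which is the only purpose of the tenancy folder here.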