loongs-zhang commented on code in PR #21992:
URL: https://github.com/apache/pulsar/pull/21992#discussion_r1519802390
##########
pip/pip-331.md:
##########
@@ -0,0 +1,129 @@
+# PIP-331: WASM Function API
+
+# Background knowledge
+
+WASM (WebAssembly) bytecode is designed to be encoded in a size- and
load-time-efficient binary format. WASM aims to leverage the common hardware
features available on various platforms to execute in browsers at machine code
speed.
+
+WASI (WebAssembly System Interface) provides a portable interface for
applications that run within a constrained sandbox environment, which allows
WASM to run in non-browser environments such as Linux. It's portable and secure.
+
+# Motivation
+
+The server and client sides of Pulsar Functions use protobuf for
decoupling. In principle, any language supported by protobuf can be supported
by Pulsar Functions. Pulsar currently provides Java, Python, and Go function
clients, but many languages are still not supported.
+
+Until every language has been adapted (which is almost certainly
impossible), users cannot write Pulsar functions in the languages they are
familiar with.
+
+# Goals
+
+## In Scope
+
+Users can write Pulsar functions in any language whose code can be compiled
into WASM bytecode (such as Rust, Go, or C++).
+
+## Out of Scope
+
+The existing capabilities of the Java Pulsar function client are not
reimplemented; WASM Pulsar functions are built on top of Java Pulsar functions.
+
+Because WASM places strict requirements on parameter types, and for
simplicity, only `java.lang.Long` is used for parameters and return
values.
+
+# High Level Design
+
+```mermaid
+flowchart LR;
+
+ subgraph develop
+ direction TB
+ SourceCode ==> |"CompileToWASM"| WasmFile ==> |"RenameFile"|
MoveToTheResourceDirectory ==> UnitTest
+ end
+
+ subgraph runtime
+ direction TB
+ PulsarFunctionJava ==> |"LoadFromResource"| TheWasmFile ==> |"Invoke"|
TheSourceCode
+ end
+
+ develop --> runtime
+```
+
+# Detailed Design
+
+## Design & Implementation Details
+
+1. Add `WasmLoader` to load the WASM file and expose WASM functions to Java;
it can also expose Java functions to WASM if needed.
+
+2. Add `AbstractWasmFunction` and `AbstractWasmWindowFunction` as the core
interfaces of the WASM function API.
+
+```java
+public abstract class AbstractWasmFunction<X, T> extends WasmLoader implements
Function<X, T> {
+
+ private static final String PROCESS_METHOD_NAME = "process";
+
+ protected static final String INITIALIZE_METHOD_NAME = "initialize";
+
+ protected static final String CLOSE_METHOD_NAME = "close";
+
+ protected static final Map<Long, Argument<?>> ARGUMENTS = new
ConcurrentHashMap<>();
+
+ @Override
+ public T process(X input, Context context) {
+ return super.getWasmExtern(PROCESS_METHOD_NAME)
+ .map(process -> {
+ Long argumentId = callWASI(input, context, process);
+ return doProcess(input, context, argumentId);
+ })
+ .orElseThrow(() -> new PulsarWasmException(
+ PROCESS_METHOD_NAME + " function not found in " +
super.getWasmName()));
+ }
+
+ private Long callWASI(X input,
+ Context context,
+ Extern process) {
+ // call WASI function
+ final Long argumentId = getArgumentId(input, context);
+ ARGUMENTS.put(argumentId, new Argument<>(input, context));
Review Comment:
> Do you have any idea what is the performance implications for this step?
> Comparing normal java function throughput compared with this WASM function?

I know this will cause a significant performance drop, but other than
serialization/deserialization, I can't think of any way to convert a byte[]
into a Java object.
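
A minimal sketch of the round trip under discussion, assuming plain JDK
serialization as the byte[] codec. `ArgumentRegistrySketch`, `register`,
`readArgument`, and `release` are hypothetical names used for illustration and
are not part of this PR; only the idea of a `Long`-keyed argument map comes
from the diff above. The host parks the Java object under a `long` ID, passes
only the ID across the WASM boundary, and serializes when the guest asks for
the bytes:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of an ID-keyed argument table; not the PR's code. */
final class ArgumentRegistrySketch {

    // Host-side table: WASM only ever sees the long key, never the object.
    private static final Map<Long, Object> ARGUMENTS = new ConcurrentHashMap<>();
    private static final AtomicLong NEXT_ID = new AtomicLong();

    /** Parks a value on the host and returns the ID to hand to WASM. */
    static long register(Object value) {
        long id = NEXT_ID.incrementAndGet();
        ARGUMENTS.put(id, value);
        return id;
    }

    /** Called when the guest asks for the bytes behind an ID. */
    static byte[] readArgument(long id) {
        Object value = ARGUMENTS.get(id);
        if (value == null) {
            throw new IllegalArgumentException("unknown argument id " + id);
        }
        return serialize(value);
    }

    /** Removes the entry once the invocation is finished. */
    static Object release(long id) {
        return ARGUMENTS.remove(id);
    }

    /** Assumption: JDK serialization stands in for whatever codec is chosen. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Turns bytes written by the guest back into a Java object. */
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the serialization cost is paid on every `readArgument` and
`deserialize` call, which is the part a throughput comparison against a plain
Java function would need to measure. Entries must be released after each
invocation, otherwise the map in the sketch grows without bound.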