lburgazzoli commented on PR #21992:
URL: https://github.com/apache/pulsar/pull/21992#issuecomment-1988302919
> @lburgazzoli Brilliant. Can you explain how the function accesses the data
of a `Message` object in the WASM function, regardless of the language it is
written in?
At this stage the function expects the payload/value to be a `byte[]`, and it
does not check whether the byte array is something the guest function can
actually process. It is up to the guest function to perform any validation it
needs.
The assumption is that, if you want to use something like WASM for
transformations, you are probably dealing with a payload type that can be
reconstructed in the target language (this is fairly simple if the data
format is JSON, YAML, etc.), or you are not interested in the payload because
the function only acts on the `Message` metadata.
I guess this is pretty similar to how a gRPC-like Pulsar Function would
work.
So the signature of the function is:
```java
public Record<byte[]> process(byte[] input, Context context) throws Exception {
    // impl here
}
```
Then, to access the individual parts of the `Message`, the host registers a
number of functions, like:
```java
wrap(
    this::getValueFn,
    "pulsar_get_value",
    List.of(),
    List.of(ValueType.I64)),
wrap(
    this::setValueFn,
    "pulsar_set_value",
    List.of(ValueType.I32, ValueType.I32),
    List.of()),
```
The related implementations look like:
```java
private Value[] getValueFn(Instance instance, Value... args) {
    // Payload of the message currently being processed.
    final byte[] rawData = this.ref.get().value();

    return new Value[] {
        write(rawData)
    };
}

private Value[] setValueFn(Instance instance, Value... args) {
    // The guest passes the address and size of the new payload
    // within its linear memory.
    final int addr = args[0].asInt();
    final int size = args[1].asInt();
    final byte[] value = instance.memory().readBytes(addr, size);

    this.ref.get().value(value);

    return new Value[] {};
}
```
Since the core WASM spec does not support threads, the function
implementation adds a lock and essentially accesses the current `Message`
through a `ThreadLocal`-like implementation.
The guest then accesses the functions exposed by the host like:

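A minimal sketch of what such guest-side bindings could look like in Rust. The import names match the functions registered by the host above; everything else is an assumption made for illustration: the `env` import module, the layout of the returned `i64` (address in the upper 32 bits, length in the lower 32 bits), and the `unpack` helper. The actual SDK may encode things differently.

```rust
// Host imports, named after the functions the host registers.
// ASSUMPTION: the import module name ("env") is a guess.
#[cfg(target_arch = "wasm32")]
#[link(wasm_import_module = "env")]
extern "C" {
    fn pulsar_get_value() -> u64;
    fn pulsar_set_value(ptr: u32, len: u32);
}

/// Splits a packed value into (address, length).
/// ASSUMPTION: upper 32 bits hold the address, lower 32 bits the length.
pub fn unpack(packed: u64) -> (u32, u32) {
    ((packed >> 32) as u32, packed as u32)
}

/// Reads the current record value out of linear memory.
#[cfg(target_arch = "wasm32")]
pub fn get_record_value() -> Vec<u8> {
    let (ptr, len) = unpack(unsafe { pulsar_get_value() });
    if len == 0 {
        return Vec::new();
    }
    // Copy the bytes; who frees the host-written buffer is left open here.
    unsafe { std::slice::from_raw_parts(ptr as usize as *const u8, len as usize).to_vec() }
}

/// Hands a new record value to the host as (address, length).
#[cfg(target_arch = "wasm32")]
pub fn set_record_value(value: Vec<u8>) {
    // The host copies the bytes during the call, so `value` can be dropped afterwards.
    unsafe { pulsar_set_value(value.as_ptr() as u32, value.len() as u32) }
}
```

The host-facing parts are gated on `wasm32` so that the pure `unpack` helper can still be compiled and tested natively.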
An example of a function written in Rust is then:
```rust
#[cfg_attr(all(target_arch = "wasm32"), export_name = "to_upper")]
#[no_mangle]
pub extern fn to_upper() {
    let val = get_record_value();
    let res = String::from_utf8(val).unwrap().to_uppercase().as_bytes().to_vec();

    set_record_value(res);
}
```
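Note that the `unwrap()` in the example panics if the payload is not valid UTF-8. Since the host does not check the payload, a guest may prefer to validate it explicitly. A minimal sketch of that idea, where `to_upper_checked` is a hypothetical helper and not part of the SDK:

```rust
use std::str::Utf8Error;

/// Upper-cases a payload, rejecting bytes that are not valid UTF-8
/// instead of panicking. Hypothetical helper, for illustration only.
pub fn to_upper_checked(payload: &[u8]) -> Result<Vec<u8>, Utf8Error> {
    // Validate first; `?` returns the error to the caller on invalid input.
    let text = std::str::from_utf8(payload)?;
    Ok(text.to_uppercase().into_bytes())
}
```

The exported function could then `match` on the result and, for example, leave the record value untouched when validation fails.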
> Can you also explain how this differs from the Component Model?
The difference from the Component Model is that quite some work must be done
on the implementation side. As an example, there is a sort of
[SDK](https://github.com/lburgazzoli/pulsar-function-wasm/tree/main/src/main/rust)
that helps with writing functions; otherwise the developer has to implement
their own ABI that matches the expectations of the host. The Component Model
would free the developer from that aspect, but under the hood the generated
code would probably do something similar, since host/guest calls always have
to pass through the linear memory.
> Is it reasonable to just wait for the Component Model to be implemented in a
runtime that can be used from Java instead of supporting two flavors?
At this stage I fear that waiting for the Component Model to become
mainstream could significantly delay the adoption of WASM, and it may put a
lot of restrictions on which runtimes a developer can use. I really hope it
will emerge as soon as possible, but given everything that happened with WASI,
it may take time (I hope to be wrong on this).