milesrichardson commented on issue #177:
URL: 
https://github.com/apache/arrow-datafusion/issues/177#issuecomment-1297918179

   Good news, fellow WebAssembly enthusiasts! It looks like the stars are 
finally aligning, and with relatively minimal patching, I successfully compiled 
the [code from the gist (create, insert and query a 
`MemTable`)](https://gist.github.com/roee88/91f2b67c3e180fa0dfb688ba8d923dae) 
to `wasm32-wasi` and `wasm32-unknown-unknown`, and ran it in `wasmedge` and the 
browser (via `wasm-pack`):
   
   ```
   ❯ docker run --rm -it -v $(pwd)/target/wasm32-wasi/debug:/app \
       wasmedge/slim:0.11.2-rc.1 wasmedge --reactor dfwasm.wasm _start
   +---+----+
   | a | b  |
   +---+----+
   | b | 10 |
   | c | 10 |
   +---+----+
   0
   ```
   
   
![image](https://user-images.githubusercontent.com/835921/199139422-194d12fa-8210-4c45-894b-5e47b4e9f8c4.png)
   
   I pushed the [proof-of-concept to a public repository at 
`splitgraph/experimental-datafusion-webassembly`](https://github.com/splitgraph/experimental-datafusion-webassembly).
 There are two branches:
   
   - 
[`wasm32-wasi`](https://github.com/splitgraph/experimental-datafusion-webassembly/tree/wasm32-wasi)
       - This is the target I got working first. The readme on this branch 
contains all the details and you should be able to reproduce it yourself.
   - 
[`wasm32-unknown-unknown`](https://github.com/splitgraph/experimental-datafusion-webassembly/tree/wasm32-unknown-unknown)
       - This is branched from `wasm32-wasi` and the diff of 
[`wasm32-wasi..wasm32-unknown-unknown`](https://github.com/splitgraph/experimental-datafusion-webassembly/compare/wasm32-wasi...wasm32-unknown-unknown?expand=1)
 shows the changes
       - The top of the readme includes instructions for running this in the 
browser, but the patch is still very messy and might not be easily 
reproducible. Make sure you check Cargo.toml for any patched crates that you 
need to have checked out at a local path.
   
   In the near future, I intend to clean up these changes and submit a PR to 
DataFusion that feature-flags WebAssembly support. 
   
   In general, the summary of requirements for `wasm32-wasi`:
   
   - Patching `arrow` to use its upstream Git repository fixed some initial 
compilation issues, but otherwise no changes were required. It seems that the 
latest `arrow-rs` release on `master` (`v26.0.0`) compiles to both 
WebAssembly targets, so DataFusion just needs to upgrade to it. (There 
was also [one minor change to 
`datafusion/physical-expr`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-wasi/datafusion.patch#L361-L363)
 required to support the upgraded Arrow package). 
   - Compiling with [`RUSTFLAGS="--cfg 
tokio_unstable"`](https://docs.rs/tokio/latest/tokio/index.html#unstable-features)
 is necessary, and will benefit from [recently stabilized wasm support in 
Tokio](https://github.com/tokio-rs/tokio/pull/4716)
   - Removing cloud providers for `object_store`, as they are [not compatible 
with the 
`wasm32-unknown-unknown`](https://github.com/apache/arrow-rs/tree/master/object_store#support-for-wasm32-unknown-unknown-target)
 target (they may be with `wasm32-wasi` in some runtimes, but I disabled them).
   - Removing local file system calls/writes (at least where necessary)
   - Removing `bzip2`
   - Using a [forked 
`reqwest`](https://github.com/samdenty/reqwest/blob/master/Cargo.toml) to add 
some wasm compatibility (note: I'm not sure how much of this was/is necessary, 
and/or if it forces tokio to resolve to a version that other packages 
incidentally benefit from)
   - [Replacing two calls to `spawn_blocking` in `sort` with 
`spawn`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-wasi/datafusion.patch#L284-L286),
 making the compiler happy but possibly causing runtime/logic errors
   - ...and some other stuff in the patch
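
   One way the `spawn_blocking` change could be feature-flagged rather than 
applied unconditionally is to gate it on the compilation target. The following 
is a minimal std-only sketch of that idea, not the code from the patch: 
`run_blocking` and `sort_off_thread` are hypothetical names, and 
`std::thread::spawn` stands in for Tokio's `spawn_blocking` so that the example 
has no dependencies.

   ```rust
   /// Hypothetical helper (not a DataFusion or Tokio API): on native targets,
   /// run the closure on a separate OS thread and wait for its result.
   #[cfg(not(target_arch = "wasm32"))]
   fn run_blocking<F, T>(work: F) -> T
   where
       F: FnOnce() -> T + Send + 'static,
       T: Send + 'static,
   {
       std::thread::spawn(work)
           .join()
           .expect("worker thread panicked")
   }

   /// On wasm32 there may be no thread to hand the work to, so this variant
   /// simply runs the closure inline on the calling thread.
   #[cfg(target_arch = "wasm32")]
   fn run_blocking<F, T>(work: F) -> T
   where
       F: FnOnce() -> T + Send + 'static,
       T: Send + 'static,
   {
       work()
   }

   /// Example caller: sort a vector through the target-dependent helper.
   fn sort_off_thread(mut values: Vec<i32>) -> Vec<i32> {
       run_blocking(move || {
           values.sort();
           values
       })
   }
   ```

   With this shape, native builds keep their off-thread behaviour and only the 
wasm32 builds take the inline path, which is the kind of split a feature flag 
would need to express.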
   
   For `wasm32-unknown-unknown`, in addition to all those requirements, [it was 
also necessary 
to](https://github.com/splitgraph/experimental-datafusion-webassembly/compare/wasm32-wasi...wasm32-unknown-unknown):
   
   - Replace usage of `std::time` with `Instant`, in both 
[`datafusion`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-unknown-unknown/datafusion.patch)
 and 
[`arrow`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-unknown-unknown/arrow-rs.patch)
 
   - Make sure every library that calls `getrandom` is also passing it the `js` 
feature flag, which I did by just patching `getrandom` and making that the 
default
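
   The clock replacement could likewise be confined to the one target that 
needs it. Below is a rough sketch, assuming a hypothetical `Stopwatch` wrapper 
and `time_it` helper (neither exists in DataFusion or `arrow`). The 
`wasm32-unknown-unknown` branch is only a placeholder that reports zero elapsed 
time; a real build would plug in whatever wasm-compatible clock the patch uses.

   ```rust
   use std::time::Duration;

   /// Native targets (and wasm32-wasi): wrap the standard monotonic clock.
   #[cfg(not(all(target_arch = "wasm32", target_os = "unknown")))]
   pub struct Stopwatch(std::time::Instant);

   #[cfg(not(all(target_arch = "wasm32", target_os = "unknown")))]
   impl Stopwatch {
       pub fn start() -> Self {
           Stopwatch(std::time::Instant::now())
       }

       pub fn elapsed(&self) -> Duration {
           self.0.elapsed()
       }
   }

   /// wasm32-unknown-unknown: never touch `std::time::Instant`.
   /// Placeholder only: it always reports zero elapsed time.
   #[cfg(all(target_arch = "wasm32", target_os = "unknown"))]
   pub struct Stopwatch;

   #[cfg(all(target_arch = "wasm32", target_os = "unknown"))]
   impl Stopwatch {
       pub fn start() -> Self {
           Stopwatch
       }

       pub fn elapsed(&self) -> Duration {
           Duration::ZERO
       }
   }

   /// Hypothetical helper: run `work` and return its result together with
   /// the time measured by the target-appropriate `Stopwatch`.
   pub fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
       let stopwatch = Stopwatch::start();
       let output = work();
       (output, stopwatch.elapsed())
   }
   ```

   Callers such as metrics code would then depend only on `Stopwatch`, so the 
target-specific choice lives in one place instead of being patched at every 
call site.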
   
   To get it to _run_ (without a runtime error related to `std::time` being 
unreachable), a few more changes were made:
   
   - Don't run the demo code in a Tokio `main` runtime, even with 
`flavor = "current_thread"`. Instead, use 
[wasm-bindgen-futures](https://rustwasm.github.io/wasm-bindgen/api/wasm_bindgen_futures/fn.spawn_local.html)
 to await a future that performs the asynchronous task that calls DataFusion.
   
   
   **This is all very messy.** I will clean it up and submit a PR to DataFusion 
once I have a better sense of the most minimal changes required and the proper 
way to feature flag them. Also, general disclaimer that I'm new to Rust and 
YMMV, especially on the `wasm32-unknown-unknown` patch - after all, I barely got 
it to run. But it does compile and create and query a small in-memory table, 
which is pretty good!

