milesrichardson commented on issue #177: URL: https://github.com/apache/arrow-datafusion/issues/177#issuecomment-1297918179
Good news, fellow WebAssembly enthusiasts! It looks like the stars are finally aligning, and with relatively minimal patching, I successfully compiled the [code from the gist (create, insert and query a `MemTable`)](https://gist.github.com/roee88/91f2b67c3e180fa0dfb688ba8d923dae) to `wasm32-wasi` and `wasm32-unknown-unknown`, and ran it in `wasmedge` and the browser (via `wasm-pack`):

```
❯ docker run --rm -it -v $(pwd)/target/wasm32-wasi/debug:/app wasmedge/slim:0.11.2-rc.1 wasmedge --reactor dfwasm.wasm _start
+---+----+
| a | b  |
+---+----+
| b | 10 |
| c | 10 |
+---+----+
0
```

I pushed the [proof-of-concept to a public repository at `splitgraph/experimental-datafusion-webassembly`](https://github.com/splitgraph/experimental-datafusion-webassembly). There are two branches:

- [`wasm32-wasi`](https://github.com/splitgraph/experimental-datafusion-webassembly/tree/wasm32-wasi)
  - This is the target I got working first. The readme on this branch contains all the details and you should be able to reproduce it yourself.
- [`wasm32-unknown-unknown`](https://github.com/splitgraph/experimental-datafusion-webassembly/tree/wasm32-unknown-unknown)
  - This is branched from `wasm32-wasi`, and the diff of [`wasm32-wasi..wasm32-unknown-unknown`](https://github.com/splitgraph/experimental-datafusion-webassembly/compare/wasm32-wasi...wasm32-unknown-unknown?expand=1) shows the changes.
  - The top of the readme includes instructions for running this in the browser, but the patch is still very messy and might not be easily reproducible. Make sure you check `Cargo.toml` for any patched crates that you need to have checked out at a local path.

In the near future, I intend to clean up these changes and submit a PR to DataFusion feature-flagging WebAssembly support.

In general, the summary of requirements for `wasm32-wasi`:

- Patching `arrow` to use its upstream Git repository fixed some initial compilation issues, but otherwise no changes were required.
  It seems like the latest release on `master` of `arrow-rs` (`v26.0.0`) can compile to both WebAssembly targets, and so DataFusion just needs to upgrade to that. (There was also [one minor change to `datafusion/physical-expr`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-wasi/datafusion.patch#L361-L363) required to support the upgraded Arrow package.)
- Compiling with [`RUSTFLAGS="--cfg tokio_unstable"`](https://docs.rs/tokio/latest/tokio/index.html#unstable-features) is necessary, and will benefit from [recently stabilized wasm support in Tokio](https://github.com/tokio-rs/tokio/pull/4716)
- Removing cloud providers for `object_store`, as they are [not compatible with the `wasm32-unknown-unknown`](https://github.com/apache/arrow-rs/tree/master/object_store#support-for-wasm32-unknown-unknown-target) target (they may be with `wasm32-wasi` in some runtimes, but I disabled them)
- Removing local file system calls/writes (at least where necessary)
- Removing `bzip2`
- Using a [forked `reqwest`](https://github.com/samdenty/reqwest/blob/master/Cargo.toml) to add some wasm compatibility (note: I'm not sure how much of this was/is necessary, and/or if it forces tokio to resolve to a version that other packages incidentally benefit from)
- [Replacing two calls to `spawn_blocking` in `sort` with `spawn`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-wasi/datafusion.patch#L284-L286), making the compiler happy but possibly causing runtime/logic errors
- ...and some other stuff in the patch

For `wasm32-unknown-unknown`, in addition to all those requirements, [it was also necessary to](https://github.com/splitgraph/experimental-datafusion-webassembly/compare/wasm32-wasi...wasm32-unknown-unknown):

- Replace usage of `std::time` with `Instant`, in both [`datafusion`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-unknown-unknown/datafusion.patch) and
  [`arrow`](https://github.com/splitgraph/experimental-datafusion-webassembly/blob/wasm32-unknown-unknown/arrow-rs.patch)
- Make sure every library that calls `getrandom` is also passing it the `js` feature flag, which I did by just patching `getrandom` and making that the default

To get it to _run_ (without a runtime error related to `std::time` being unreachable), a few more changes were made:

- Don't run the demo code in a Tokio `main` runtime, even with `flavor = "current_thread"`. Instead, use [wasm-bindgen-futures](https://rustwasm.github.io/wasm-bindgen/api/wasm_bindgen_futures/fn.spawn_local.html) to await a future that performs the asynchronous task that calls DataFusion

**This is all very messy.** I will clean it up and submit a PR to DataFusion once I have a better sense of the most minimal changes required and the proper way to feature flag them.

Also, general disclaimer that I'm new to Rust and YMMV, especially on the `wasm32-unknown-unknown` patch - after all, I barely got it to run. But it does compile and create and query a small in-memory table, which is pretty good!

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
