joemarshall commented on issue #35176: URL: https://github.com/apache/arrow/issues/35176#issuecomment-1514814122
On Emscripten in the browser, the local disk is memory-based and may or may not be synced to some kind of permanent storage (via an asynchronous `syncfs` call). Access to this disk is synchronous, but very quick because it is in memory. In Node, you can use the real file system directly.

Network is weird, because Emscripten code is typically hosted in browsers. For HTTP/HTTPS one can call out to JavaScript to use the fetch API, which is asynchronous. Right now network I/O is async only, with the exception of synchronous `XMLHttpRequest` in a web worker, which is a hacky workaround for synchronous HTTP access. In theory there is also a WebSockets wrapper which turns socket calls in C into WebSocket calls to the hosting server, but I don't know how well it works.

Basically, as I understand it, the potential in Emscripten for Arrow is:

1) Local file system stuff should just work, as long as it can be read without threads (I had code reading a Parquet file which worked okay).
2) Network things (e.g. reading from S3) would probably need porting work so that they run over HTTP or WebSockets. Anything with a REST API or WebSockets API should be fine. Things that require direct connections or running servers won't work.
3) I think this means that Flight is going to be quite limited in its usefulness in WebAssembly, so I haven't even thought about compiling that.

Personally, for what I want, I just want core Arrow with file support to work on Emscripten. I think that is a decent starting point before getting into complexities.
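To illustrate what "local file system stuff should just work" could look like, here is a minimal sketch in plain C of a blocking, single-threaded whole-file read using only standard I/O. The helper names `write_file` and `read_whole_file` are hypothetical and are not part of Arrow. The sketch assumes that, when built with Emscripten for the browser, these calls are served by the in-memory file system, and that in Node or natively they would hit the real disk.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `len` bytes to `path`. Returns 0 on success. */
int write_file(const char *path, const void *data, size_t len) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, len, f);
    int close_rc = fclose(f);
    return (written == len && close_rc == 0) ? 0 : -1;
}

/* Hypothetical helper: read a whole file with blocking calls and no threads.
 * Returns a malloc'd, NUL-terminated buffer (caller frees) and stores the
 * byte count in *out_len, or returns NULL on any failure. */
char *read_whole_file(const char *path, size_t *out_len) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);

    char *buf = malloc((size_t)size + 1);
    if (!buf) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    if (got != (size_t)size) { free(buf); return NULL; }

    buf[size] = '\0';
    *out_len = got;
    return buf;
}
```

This sketch does nothing about persistence: in the browser, data written this way would live only in memory unless it is separately synced to permanent storage, which is the asynchronous step mentioned above.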
