joemarshall commented on issue #35176:
URL: https://github.com/apache/arrow/issues/35176#issuecomment-1514814122

   On Emscripten in the browser, the local disk is memory-based and may or may 
not be synced to some kind of permanent storage (via an asynchronous syncfs 
call). Access to this disk is synchronous, but very quick because it is in 
memory. In Node, you can use the real file system directly. 
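   
   A minimal sketch of what that looks like from C++, not taken from any 
existing Arrow code. The function names `write_and_read_back` and 
`request_persist` are made up for illustration. The sync step assumes a 
persistent backend such as IDBFS has been mounted and that the `FS` object is 
reachable from the generated JavaScript; without that there is nothing to sync.

```cpp
#include <fstream>
#include <iterator>
#include <string>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

// Write `data` to `path`, then read it back with ordinary blocking stream I/O.
// Under Emscripten in the browser these calls hit the in-memory file system.
std::string write_and_read_back(const std::string& path,
                                const std::string& data) {
    {
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
    }  // closing the stream flushes the bytes to the file

    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Ask the runtime to copy the in-memory state to permanent storage.
// Assumption: a persistent file system (e.g. IDBFS) is mounted.
// The call returns immediately; the callback runs later, so the data is
// not guaranteed to be persisted when this function returns.
// Outside Emscripten this function does nothing.
void request_persist() {
#ifdef __EMSCRIPTEN__
    EM_ASM(
        FS.syncfs(false, function (err) {
            if (err) console.error(err);
        });
    );
#endif
}
```

   The read and write are plain synchronous calls; only the optional sync to 
permanent storage is asynchronous.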
   
   Network is weird, because the code is typically hosted in a browser. For 
HTTP / HTTPS one can call out to JavaScript to use the fetch API, which is 
asynchronous. Right now there's only async I/O for network, with the exception 
of XMLHttpRequest if you're in a web worker, which is a hacky workaround for 
synchronous HTTP access. In theory there's also a WebSockets wrapper which 
turns socket calls in C into WebSocket calls to the hosting server, but I don't 
know how well it works.
   
   Basically, as I understand it, the potential in Emscripten for Arrow is:
   
   1) Local file system stuff should just work, if it can be read without 
threads (I had code reading a Parquet file which worked okay)
   
   2) Network things (e.g. reading from S3) would probably need porting work 
so that they go over HTTP or WebSockets. Anything with a REST API or 
WebSockets API should be fine. Things that require direct connections or 
running a server won't work.
   
   3) I think this means that Flight is going to be quite limited in its 
usefulness in WebAssembly, so I haven't even thought about compiling that.
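   
   To make point 1 concrete, here is a small sketch of the kind of blocking, 
single-threaded file access involved. It is not Arrow's Parquet reader; it 
only checks the `PAR1` magic bytes at both ends of a file and decodes the 
4-byte little-endian footer length stored just before the trailing magic. The 
function name `parquet_footer_length` is hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>

// Return the footer (metadata) length recorded at the end of a Parquet file,
// or -1 if the file cannot be opened, is too short, or lacks the magic bytes.
// Uses only synchronous seeks and reads on the calling thread.
std::int64_t parquet_footer_length(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size < 12) return -1;  // 4 (magic) + 4 (length) + 4 (magic)

    char head[4];
    char tail[8];
    in.seekg(0, std::ios::beg);
    in.read(head, 4);
    in.seekg(-8, std::ios::end);
    in.read(tail, 8);
    if (!in) return -1;

    if (std::memcmp(head, "PAR1", 4) != 0 ||
        std::memcmp(tail + 4, "PAR1", 4) != 0) {
        return -1;
    }

    // Decode the 4-byte little-endian length that precedes the trailing magic.
    std::uint32_t len = 0;
    for (int i = 3; i >= 0; --i) {
        len = (len << 8) | static_cast<unsigned char>(tail[i]);
    }
    return static_cast<std::int64_t>(len);
}
```

   Nothing here needs a thread pool or async I/O, which is why this class of 
code is the easy part of a port.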
   
   Personally, for what I want, I just want core Arrow with file support to 
work on Emscripten - I think that is a decent starting point before getting 
into complexities.


