leopd commented on issue #3946: When predicting, does mxnet provide thread-safe interface?
URL: https://github.com/apache/incubator-mxnet/issues/3946#issuecomment-327942322

TL;DR: if you use a high-performance Python web server like gunicorn, you'll get what you want. You'll have lots of cores running in parallel working on different requests, and you won't need a copy in memory of the model for each worker.

Getting a complex codebase to be thread-safe is no small task, so this won't be "resolved" any time soon. Fortunately, it's not necessary here for most of what you want.

Your answer lies in the Unix fork() system call. If you want to understand, go read: https://en.wikipedia.org/wiki/Fork_(system_call)

The magic of fork is copy-on-write memory semantics, whereby each forked worker has its own virtual memory address space, but they all share the same physical memory. (Until one of them writes to the shared memory, in which case a private copy of that memory block is made in that process's virtual address space -- thus "copy-on-write".)

So even though it's not multi-threaded, fork() and pre-fork worker servers like gunicorn let you accomplish almost the same thing with multiple processes instead of threads. Forked processes are somewhat more heavyweight than threads, but they're nowhere near as expensive as running the same command multiple times.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards, Apache Git Services
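
The fork-based sharing described in the comment can be sketched roughly as follows. This is a minimal illustration, not MXNet or gunicorn code: MODEL is a hypothetical stand-in for a loaded model, and predict() and run_workers() are made-up names. The parent loads the data once, then forks one worker per input; each worker reads the parent's copy through copy-on-write pages instead of reloading it. It assumes a Unix-like system where os.fork() is available. Note that CPython updates reference counts on objects it touches, so some shared pages will still be copied in practice. With gunicorn, the application has to be loaded before the workers fork (for example via its --preload option) for this kind of sharing to apply.

```python
import os

# Hypothetical stand-in for a model loaded once in the parent process.
MODEL = list(range(100_000))


def predict(x):
    """Toy 'inference': look up a value in the preloaded table."""
    return MODEL[x % len(MODEL)] * 2


def run_workers(inputs):
    """Fork one child per input; each child reuses the parent's MODEL."""
    children = []
    for x in inputs:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: MODEL is inherited via copy-on-write, not reloaded.
            os.close(read_fd)
            os.write(write_fd, str(predict(x)).encode())
            os.close(write_fd)
            os._exit(0)  # exit immediately, skipping parent cleanup handlers
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((x, pid, read_fd))

    results = {}
    for x, pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            results[x] = int(pipe.read())
        os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return results


if __name__ == "__main__":
    print(run_workers([1, 2, 3]))
```

Each child here handles a single input and exits; a real pre-fork server keeps its workers alive and hands them requests over a shared listening socket instead.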
