leopd commented on issue #3946: When predicting, does mxnet provide thread-safe 
interface?
URL: 
https://github.com/apache/incubator-mxnet/issues/3946#issuecomment-327942322
 
 
   TL;DR: if you use a high-performance Python web server like gunicorn, you'll 
get what you want.  You'll have lots of cores working in parallel on different 
requests, and you won't need a separate in-memory copy of the model for each 
worker.
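
   As a rough illustration, a gunicorn config along these lines is one way to 
set that up.  The file name, bind address, and worker count are made-up values, 
and it assumes your app module loads the model at import time and only reads 
it afterwards.  The sharing depends on `preload_app`, which makes gunicorn 
import the app in the master process before forking workers.  Whether a given 
library tolerates being forked after initialization is something to verify for 
your setup.

```python
# gunicorn.conf.py -- hypothetical example values, adjust for your deployment
import multiprocessing

# Address the workers listen on (assumed value for illustration).
bind = "127.0.0.1:8000"

# One worker process per core, so requests are handled in parallel.
workers = multiprocessing.cpu_count()

# Import the application (and load the model) once in the master process,
# *before* forking, so workers share those memory pages copy-on-write.
preload_app = True
```

   You would then start the server with something like 
`gunicorn -c gunicorn.conf.py app:application`, where `app:application` is a 
placeholder for your own WSGI module and callable.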
   
   Getting a complex codebase to be thread-safe is no small task, so this won't 
be "resolved" any time soon.  Fortunately, thread safety isn't necessary for 
most of what you want.  Your answer lies in the Unix fork() system call.  If 
you want to understand how it works, go read: 
https://en.wikipedia.org/wiki/Fork_(system_call)
   
   The magic of fork is copy-on-write memory semantics, whereby each forked 
worker has its own virtual address space, but they all share the same physical 
memory.  (Until one of them writes to the shared memory, in which case a 
private copy of that memory page is made for the writing process -- thus 
"copy-on-write".)  So even though it's not multi-threaded, fork() and pre-fork 
worker servers like gunicorn let you accomplish almost the same thing with 
multiple processes instead of threads.  Forked processes are somewhat more 
heavyweight than threads, but they're nowhere near as expensive as running the 
same command multiple times, because a separately launched process has to load 
its own copy of the model.
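
   Here is a minimal sketch of the pre-fork pattern using plain `os.fork()` 
(Unix only).  The "model" is just a list of numbers standing in for real 
weights, and `serve_with_forked_workers` is a made-up helper name, not anything 
from mxnet or gunicorn.  The parent loads the data once, and each child reads 
it without loading it again.

```python
import os


def serve_with_forked_workers(model, batches):
    """Fork one worker per batch; each worker reads the parent's model."""
    children = []
    for batch in batches:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: the model is already in memory, inherited from the parent.
            try:
                os.close(read_fd)
                result = sum(model[i] for i in batch)  # stand-in for predict()
                os.write(write_fd, str(result).encode())
                os.close(write_fd)
            finally:
                os._exit(0)  # never fall through into the parent's code
        # Parent: keep the read end and move on to fork the next worker.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            results.append(int(pipe.read()))
        os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return results


if __name__ == "__main__":
    model = list(range(1000))  # loaded once, before any fork
    print(serve_with_forked_workers(model, [[0, 1], [2, 3], [4, 5]]))
```

   One caveat: CPython updates reference counts inside objects even when you 
only read them, so pages holding small Python objects can still end up copied.  
Large numeric buffers held in native memory tend to be shared much more 
effectively.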
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
