Hi Kellen,
Great to see some progress on this, as it is one of the major problems we face
right now. Your approach seems like a good fit for a short- to mid-term solution.
Have you also considered using some sort of signaling? As far as I understand
from your proposal and the example code, leverag…
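To make the signaling idea concrete, here is a minimal sketch of what I have
in mind, using a blocking queue as the signal; InferRequest, SignalingSketch,
and the predict parameter are hypothetical names for illustration, not
anything from the proposal:

    import java.util.concurrent.ArrayBlockingQueue

    // Hypothetical request type: an input batch plus a one-slot reply channel.
    final case class InferRequest(
        input: Array[Float],
        reply: ArrayBlockingQueue[Array[Float]])

    object SignalingSketch {
      // The bounded queue doubles as the signal: the engine thread blocks in
      // take() until a caller hands a request over, so there is no busy-waiting.
      private val requests = new ArrayBlockingQueue[InferRequest](64)

      // A single engine thread owns the (non-thread-safe) model and its weights.
      def startEngine(predict: Array[Float] => Array[Float]): Thread = {
        val t = new Thread(() => {
          while (true) {
            val req = requests.take()          // blocks until a caller signals
            req.reply.put(predict(req.input))  // run inference, signal the caller back
          }
        })
        t.setDaemon(true)
        t.start()
        t
      }

      // Callable from any number of caller threads.
      def infer(input: Array[Float]): Array[Float] = {
        val reply = new ArrayBlockingQueue[Array[Float]](1)
        requests.put(InferRequest(input, reply)) // signal the engine thread
        reply.take()                             // block until the result arrives
      }
    }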
Good suggestion, Kellen!
I like the idea; it would solve an existing deficiency in MXNet that has
so far only been worked around. As an example, the recently added Scala
inference API (part of the 1.2 RC) implemented a dispatcher in Scala to
work around that limitation.
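For anyone who has not looked at that code, the workaround amounts to
something like the sketch below; Model and Dispatcher are simplified,
hypothetical stand-ins rather than the actual 1.2 RC types:

    import java.util.concurrent.Executors
    import scala.concurrent.{ExecutionContext, Future}

    // Hypothetical stand-in for a loaded model whose predict() is not thread-safe.
    trait Model {
      def predict(input: Array[Float]): Array[Float]
    }

    // All calls are funneled onto one thread that owns the model, so the weights
    // exist once in memory and callers never touch the model concurrently.
    final class Dispatcher(model: Model) {
      private val singleThread =
        ExecutionContext.fromExecutorService(Executors.newSingleThreadExecutor())

      def predict(input: Array[Float]): Future[Array[Float]] =
        Future(model.predict(input))(singleThread)

      def shutdown(): Unit = singleThread.shutdown()
    }

The obvious cost is that a single dispatcher serializes all inference work,
which is exactly the deficiency Kellen's proposal would remove.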
Would be great to better understand th…
Hello MXNet developers,
I’ve recently been speaking with users who’d like to run parallel inference
requests with MXNet on their service. They’ll do this on GPUs, and due to
resource constraints, they’d like to do this without duplicating their
model’s weights in memory. They’d also like to run i…
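To make the requirement concrete, the usage they describe looks roughly like
the sketch below; SharedWeights, WorkerExecutor, and the parameter names are
hypothetical placeholders, and the real compute is elided:

    // Weights are loaded once and shared read-only across all worker threads.
    final class SharedWeights(val params: Map[String, Array[Float]])

    // Each worker owns only its small mutable state (workspace, buffers, ...);
    // the weights themselves are never duplicated.
    final class WorkerExecutor(shared: SharedWeights) {
      private val workspace = new Array[Float](1024)

      def run(input: Array[Float]): Array[Float] = {
        val w = shared.params("fc1_weight") // read-only view of the shared weights
        var i = 0
        while (i < input.length) { workspace(i) = input(i) * w(i); i += 1 }
        workspace.take(input.length)
      }
    }

    object ParallelInference {
      def main(args: Array[String]): Unit = {
        val shared = new SharedWeights(Map("fc1_weight" -> Array.fill(1024)(0.01f)))
        val workers = (1 to 4).map { id =>
          new Thread(() => {
            val exec = new WorkerExecutor(shared) // N executors, one copy of the weights
            val out = exec.run(Array.fill(8)(1.0f))
            println(s"worker $id produced ${out.length} outputs")
          })
        }
        workers.foreach(_.start())
        workers.foreach(_.join())
      }
    }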