Thank you all for your feedback.

@Chris: Yes, one of the Amazon users (Calum Leslie) contributed the
dispose pattern, removing the freeing of native handles in finalizers and
logging a warning instead. This was done because calling free in
finalizers was segfaulting the application at random points, and the
crashes were very hard to reproduce and debug.
The dispose pattern worked for some cases but made code cumbersome from a
readability standpoint, since you have to keep track of every object that
is created (imagine slice/reshape: instead of writing expressions you now
create unnecessary variables and call dispose on each of them).
As the first graph in the design shows, despite carefully calling dispose
on most objects there was a constant memory leak, and diagnosing the leaks
wasn't straightforward. Note that finalizers run on a separate thread,
later than the point at which the object was found unreachable.
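To make the readability cost concrete, here is a rough sketch (FakeNDArray and its methods are illustrative stand-ins, not the actual MXNet-Scala API) of why the dispose pattern turns a one-line slice/reshape expression into bookkeeping:

```java
// Illustrative sketch only: FakeNDArray stands in for MXNet-Scala's NDArray,
// which wraps a handle to memory that lives on the C++ heap.
class FakeNDArray {
    final float[] data;
    boolean disposed = false;

    FakeNDArray(float[] data) { this.data = data; }

    FakeNDArray slice(int from, int until) {
        return new FakeNDArray(java.util.Arrays.copyOfRange(data, from, until));
    }

    FakeNDArray reshape(int n) {
        return new FakeNDArray(java.util.Arrays.copyOf(data, n));
    }

    // A real implementation would free the native handle here.
    void dispose() { disposed = true; }
}

class DisposeDemo {
    static void run() {
        FakeNDArray a = new FakeNDArray(new float[] {1f, 2f, 3f, 4f});
        // The natural expression style leaks the intermediate slice:
        //     FakeNDArray r = a.slice(0, 2).reshape(2);
        // Under the dispose pattern every intermediate needs its own name
        // just so it can be disposed later:
        FakeNDArray s = a.slice(0, 2);
        FakeNDArray r = s.reshape(2);
        // ... use r ...
        r.dispose();
        s.dispose();
        a.dispose();
    }
}
```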

@Timur, thanks for the feedback.
1) No, the goal here is to manage the native memory that is allocated for
various operations. In MXNet-Scala most objects live on the C++ heap and
the Scala objects are wrappers around them; when the MXNet engine runs
operations it expects the objects to be accessible on the C++ heap.

2) Agreed, MNIST is not representative; the goal was to understand and
show that the existing code has hard-to-debug memory leaks (even for
MNIST). I was aiming to test my prototype code and see if my changes make
a difference. Yizhi suggested I run tests against a ResNet-50 model, which
I will do as part of my implementation; I think this is a standard
benchmark model that is widely used. Also note that most of the
MXNet-Scala use cases we have seen are for inference.

3) No, we haven't created a branch for the Java API work. Please look at
this design and kindly leave your feedback:
https://cwiki.apache.org/confluence/display/MXNET/MXNet+Java+Inference+API

4) Calling System.gc() will be configurable (including not calling GC at
all). One piece of feedback I got from a user is that calling System.gc on
the user's behalf is intrusive, which I think is also the point you are
making.

5) Understood and agreed. I see calling GC as only one part of the
solution, and a configurable option at that. For GPUs, training, and other
memory-intensive applications, ResourceScope would be a very good option.
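As a rough illustration of the ResourceScope idea (the names and semantics here are assumptions based on my prototype, not the final API): native-backed objects register with the innermost open scope, and the scope frees them deterministically when it closes, instead of waiting on finalizers:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a ResourceScope-like construct; the actual
// implementation in the proposal may differ.
class ResourceScope implements AutoCloseable {
    private static final ThreadLocal<Deque<ResourceScope>> SCOPES =
            ThreadLocal.withInitial(ArrayDeque::new);

    private final Deque<Runnable> toFree = new ArrayDeque<>();

    public ResourceScope() {
        SCOPES.get().push(this); // becomes the innermost open scope
    }

    // Called by native-backed objects at construction time.
    public static void register(Runnable free) {
        Deque<ResourceScope> scopes = SCOPES.get();
        if (!scopes.isEmpty()) {
            scopes.peek().toFree.push(free); // LIFO: free newest first
        }
    }

    @Override
    public void close() {
        SCOPES.get().pop();
        while (!toFree.isEmpty()) {
            toFree.pop().run(); // deterministic release at scope exit
        }
    }
}
```

With try-with-resources, everything created inside `try (ResourceScope scope = new ResourceScope()) { ... }` would be freed at the closing brace, with no per-object dispose calls in user code.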

Another alternative is to create ByteBuffers in Java and map the C++
pointers onto the JVM heap by tapping into the native malloc/free; that
way the JVM is aware of all the memory that is allocated and can free it
appropriately whenever the objects become unreachable. I have to note that
this still does not solve the problem of memory accumulating until GC
kicks in. This approach is also very involved and might not be tenable.
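For context, direct ByteBuffers are the existing JVM mechanism along these lines: the JVM accounts for the off-heap allocation (bounded by -XX:MaxDirectMemorySize) and releases the backing memory once the buffer is garbage-collected. A minimal sketch:

```java
// Sketch of the direct-ByteBuffer idea: the JVM accounts for direct
// allocations (bounded by -XX:MaxDirectMemorySize) and releases the
// backing native memory once the buffer is garbage-collected.
class DirectBufferDemo {
    static java.nio.ByteBuffer allocate() {
        java.nio.ByteBuffer buf = java.nio.ByteBuffer.allocateDirect(1024);
        buf.putFloat(0, 3.14f);
        // The native memory behind `buf` is reclaimed only after `buf`
        // becomes unreachable AND a GC cycle runs -- the same
        // accumulation caveat noted above.
        return buf;
    }
}
```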

@Marco, thanks for your comments.
1) The JVM kicks off GC when it feels pressure on the JVM heap, not on CPU
RAM. Objects on the GPU are no different: they are still off-heap (outside
the JVM heap), so this would work. Look at the graph in the doc that shows
the GAN example running on GPUs.

2) I am not looking to rewrite memory allocation in MXNet; that will still
be handled by the C++ backend. The goal here is to free (reduce the
shared-pointer count of) native memory when JVM objects go out of scope
(become unreachable).


@Carin, yes hopefully this would alleviate the memory management headache
for our users.

Hope that makes sense.

Thanks, Naveen


On Wed, Sep 12, 2018 at 6:06 AM, Carin Meier <carinme...@gmail.com> wrote:

> Naveen,
>
> Thanks for putting together the detailed document and kickstarting this
> effort. It will benefit all the MXNet JVM users and will help solve a
> current pain point for them.
>
> - Carin
>
> On Tue, Sep 11, 2018 at 5:37 PM Naveen Swamy <mnnav...@gmail.com> wrote:
>
> > Hi All,
> >
> > I am working on managing Off-Heap Memory Management and have written a
> > proposal here based on my prototype and research I did.
> >
> > Please review the doc and provide your feedback ?
> >
> > https://cwiki.apache.org/confluence/display/MXNET/JVM+Memory+Management
> >
> > I had offline discussion with a few people I work with and added their
> > feedback to the doc as well.
> >
> > Thanks, Naveen
> >
>
