On Sun, Mar 18, 2018 at 04:45:39PM -0400, Yasmine Dumouchel wrote:
> Hi, 
> 
> I have studied computer science at McGill University for the past year
> and am starting my Master's degree in Computer Science next September
> at the University of Montreal. I have looked at the ideas proposed for
> GSoC and would love to create an automatic binding for the mlpack
> library. Specifically, I would be interested in creating bindings for
> a language that runs on the JVM, such as Java and/or Scala. I have
> thought of a few possible ways to approach this task. For instance, one
> could use the Java Native Interface (JNI), which enables Java code to
> call libraries written in C, C++, and assembly. However, it sometimes
> incurs considerable overhead and performance loss. To help, we could
> complement it with the SWIG tool. Alternatively, another option would
> be to use standard C to wrap the C++ functions, so that any objects
> created stay inside the native runtime and aren't exposed to the JVM.
> This would allow for a simplified and robust C interface, which could
> then be exposed to Java/Scala using JNR. JNA, together with BridJ, is
> another possibility that I believe could be worth exploring and
> evaluating. Therefore, my question is the following: how precise and
> detailed should our proposal be? And to what degree is our plan
> “set in stone”; that is, how much flexibility to deviate from our
> proposed plan do we have if chosen for the project? Thank you very much!

Hi Yasmine,

Thanks for getting in touch.  I agree that bindings to a JVM language
would be really nice, and JNI is likely the way to do this.  It sounds
like you have put a good amount of thought into how to create these
bindings, which is great.

When I wrote the Python bindings, I avoided using SWIG, which would
expose full information about each class to Python.  Instead, the Python
bindings essentially just hold pointers to mlpack classes, without
knowing what those pointers actually are.  Working with raw pointers is
also important for Armadillo: we need to be able to pass a pointer to
memory held by a Java matrix object to C++ directly, so that Armadillo
can avoid copying the entire data matrix.
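
To make the no-copy idea a bit more concrete, here is a rough sketch of
what the Java side of such a matrix might look like.  The class name
DirectMatrix and all of its methods are hypothetical (nothing like this
exists in mlpack); the only assumptions are that the data lives in a
direct ByteBuffer in native byte order, and that it is laid out
column-major, which is how Armadillo stores matrices.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

/** Hypothetical sketch: a Java matrix whose storage lives outside the Java heap. */
final class DirectMatrix {
    final int rows;
    final int cols;
    private final ByteBuffer raw;    // direct buffer: native code can address it in place
    private final DoubleBuffer data; // view of the same memory as doubles

    DirectMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        // Math.multiplyExact throws on int overflow instead of wrapping silently.
        int bytes = Math.multiplyExact(Math.multiplyExact(rows, cols), Double.BYTES);
        // Native byte order, so C++ could read the doubles without any conversion.
        this.raw = ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
        this.data = raw.asDoubleBuffer();
    }

    // Column-major layout: element (r, c) is stored at offset c * rows + r.
    private int index(int r, int c) {
        if (r < 0 || r >= rows || c < 0 || c >= cols) {
            throw new IndexOutOfBoundsException("(" + r + ", " + c + ")");
        }
        return c * rows + r;
    }

    double get(int r, int c) { return data.get(index(r, c)); }

    void set(int r, int c, double value) { data.put(index(r, c), value); }

    /**
     * The buffer that a (hypothetical) native method would receive.  On the
     * C++ side, JNI's GetDirectBufferAddress() returns the address of this
     * memory, which could then be handed to Armadillo's auxiliary-memory
     * constructor with copy_aux_mem = false.
     */
    ByteBuffer buffer() { return raw; }

    public static void main(String[] args) {
        DirectMatrix m = new DirectMatrix(2, 3);
        m.set(1, 2, 7.5);
        System.out.println("direct: " + m.buffer().isDirect());
        System.out.println("m(1, 2) = " + m.get(1, 2));
    }
}
```

The sketch only covers the Java half; the native half, and whether JNI,
JNR, or JNA is the better way to hand the buffer across, is exactly the
part that would need a proof of concept.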

I would say, if you have a good idea of how to avoid copying the
Armadillo matrix via JNI or JNR or JNA, this would make for a strong
proposal, especially if you have a proof of concept.

Now I realize I have not actually answered the questions you asked, so I
will do that now...

 1. Ideally, the proposal should be detailed enough to make it clear to
    a mentor that you have a solid plan for accomplishing the work.
    Based on what you have written here, it sounds like you are on the
    right track, and when you write a proposal I would suggest focusing
    on a few additional things: (1) the overall structure of the code
    and what components will be necessary; (2) how you will test the
    individual components; (3) what the interface will look like to a
    Java or Scala user; (4) a realistic timeline of the work.

 2. The plan is definitely not set in stone---sometimes, estimates are
    wrong, or ideas don't actually work out.  So with GSoC projects, it
    is always possible to revisit the goals and timeline if things are
    not going well.

I hope this is helpful---let me know if I can clarify anything.

Thanks,

Ryan

-- 
Ryan Curtin    | "Open the pig!"
r...@ratml.org |   - Frank Moses
_______________________________________________
mlpack mailing list
mlpack@lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
