By Java API I mean a set of classes I can use from Java. I tried this with the current Scala API but wasn't very successful. If you know a bit about Scala internals you can probably figure it all out, but that makes it rather unpleasant to use. You don't necessarily need to write Java code to build a Java API; you can also write Scala code and stick to certain rules so that it is callable from Java code without magic tricks.
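To make the kind of rules I have in mind concrete, here is a minimal sketch. The names `Shape` and `JavaShapes` are made up for illustration and are not part of the MXNet Scala package; the point is only that every public signature uses types a Java caller already knows, with no `Seq`, no default arguments and no implicits.

```scala
import java.util.{Arrays, List => JList}
import scala.annotation.varargs
import scala.collection.JavaConverters._

// Hypothetical stand-in for a Scala-idiomatic class; NOT an MXNet class.
final class Shape(val dims: Seq[Int]) {
  def size: Int = dims.product
}

// Hypothetical facade whose signatures only use Java-friendly types.
object JavaShapes {
  // Accept java.util.List instead of Seq, so Java needs no Scala converters.
  def fromList(dims: JList[Integer]): Shape =
    new Shape(dims.asScala.map(_.intValue).toList)

  // @varargs asks scalac to also emit a Java-style varargs forwarder.
  @varargs
  def of(dims: Int*): Shape = new Shape(dims.toList)

  // Explicit overloads instead of a default argument, which Java cannot use.
  def filled(rank: Int): Shape = filled(rank, 1)
  def filled(rank: Int, dim: Int): Shape = new Shape(List.fill(rank)(dim))
}

object Demo {
  def main(args: Array[String]): Unit = {
    val fromJava = JavaShapes.fromList(Arrays.asList(Int.box(2), Int.box(3), Int.box(4)))
    println(fromJava.size)
    println(JavaShapes.of(2, 3).size)
    println(JavaShapes.filled(3, 2).dims)
  }
}
```

From Java this would be called as `JavaShapes.of(2, 3)` or `JavaShapes.fromList(Arrays.asList(2, 3, 4))`, with the conversion to Scala collections hidden inside the facade.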
So yeah, maybe we should just take a look at the Scala API, come up with a list of things that are difficult when it is used from Java code, and see how it can be improved. That approach probably at least gives you the advantages mentioned here before: quick to do, no duplication, etc. Afterwards we could still work on an approach for Java which goes beyond "build a Scala API wrapper".

If you look at the quick wins, maybe a good approach would just be to give the following advice to people who need to access MXNet from Java code:
- Integrate MXNet with custom Scala code
- Use a Maven/Gradle build to create a module of the integration which can be called from your Java code

Jörn

On Wed, Aug 16, 2017 at 10:02 PM, Nan Zhu <[email protected]> wrote:
> Hi, Joern,
>
> when you say "Java API " it's sharing scala impl or not?
>
> Best,
>
> Nan
>
> On Wed, Aug 16, 2017 at 12:46 PM, Joern Kottmann <[email protected]> wrote:
>
>> Seems like we are all agree about the idea to add a Java API.
>>
>> Maybe it is just me, but it wouldn't at all make sense for me (OpenNLP
>> use case) to use the Java API when it requires a Scala dependency,
>> because at that point I would be better of just using the Scala API,
>> and ensure that the things I build are compatible with Java.
>>
>> So if I don't want to add Scala as a dependency then I am better off
>> building something on top of a generated JNI layer. As far as I can
>> tell from my tests with the scala-package you can get quite far with
>> MXNet using NDArray and the Symbol API.
>>
>> Maybe we could work on this from two sides as described by Pracheer.
>> If we have a well defined Java API you could look at the work I have
>> done by then and see how it can be plugged in or what can be learnt
>> from it.
>>
>> Jörn
>>
>> On Wed, Aug 16, 2017 at 9:05 PM, Nan Zhu <[email protected]> wrote:
>> > +1 for Sandeep's suggestion
>> >
>> > On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu <[email protected]> wrote:
>> >
>> >> Agree with Sandeep, while I guess the performance won't change. But
>> >> yes, benchmark talks.
>> >>
>> >> Moreover, in Scala package we use macros to generate operators
>> >> automatically, which will require more efforts if we switch to pure
>> >> Java.
>> >>
>> >> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <[email protected]>:
>> >> > The fastest way to get Java binding is through building Java native
>> >> > wrappers on Scala package.
>> >> > Disadvantages would be:
>> >> > * *Bloated library size:* May not be suitable for users planning to use
>> >> > Java APIs in Android of such smaller systems.
>> >> > * *Performance:* Performance may not be as good as building directly
>> >> > over JNI and implementing ground up. For example, taking NDArray dimensions
>> >> > as Java ArrayList then converting it to Scala Seq to adapt for Scala
>> >> > NDArray API and more such adapters.
>> >> >
>> >> > However, building ground up from JNI would be a huge effort without
>> >> > actually getting feedback from users early.
>> >> >
>> >> > *My Plan:*
>> >> > 1. Build Java interface on top of Scala package.
>> >> > 2. Get early feedback from users. It may turn out Java is not a great
>> >> > candidate for DL training jobs.
>> >> > 3. Solidify the interface (APIs) for Java users.
>> >> > 4. Do performance benchmarks to see Scala Native / Java interface. This
>> >> > gives us comparable numbers on performance in Java.
>> >> > 5. Over a period of time replace underlying Scala usage with JNI base and
>> >> > native Java implementation. Provided feedback from users is positive.
>> >> >
>> >> > Comments/Suggestion?
>> >> >
>> >> > Regards,
>> >> > Sandeep
>> >> >
>> >> > On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu <[email protected]> wrote:
>> >> >
>> >> >> What Nan and I worried about is the re-implementation of something
>> >> >> like https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246,
>> >> >> and the executorManager, NDArray, KVStore ... it uses.
>> >> >>
>> >> >> the C API stays at the very low level. If this is the purpose, we can
>> >> >> simply move ml.dmlc.mxnet.LibInfo to 'java' folder and compile without
>> >> >> scala, no need to introduce JavaCPP. But I don't think this is what
>> >> >> users want.
>> >> >>
>> >> >> 2017-08-17 1:41 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> >> > There will be a new scala version one day, and the story we had with
>> >> >> > going from 2.10 to 2.11 might just repeat. In the end if you make a
>> >> >> > dependency using scala you just end up making it for the currently
>> >> >> > popular scala versions. And that might be ok for projects with
>> >> >> > developers who are familiar with these issues, but it is not ok for
>> >> >> > java projects, where people might not expect it or know about these
>> >> >> > problems. It just makes it harder to use.
>> >> >> >
>> >> >> > To me it looks like that the C API is very stable and used by all/most
>> >> >> > other APIs. If we have a Java API - accessing the C API via JavaCPP -
>> >> >> > then we should end up with a pretty stable solution and a lot the code
>> >> >> > that is duplicated with the Scala API is the generated code.
>> >> >> >
>> >> >> > I think we should explore this possible way of implementing it with a
>> >> >> > proof-of-concept.
>> >> >> >
>> >> >> > And if we have a well made Java API it might be something which maybe
>> >> >> > wouldn't need a lot of additions to be pleasurable to use from scala.
>> >> >> >
>> >> >> > Jörn
>> >> >> >
>> >> >> > On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu <[email protected]> wrote:
>> >> >> >> I don't think there will be problems under "11", did the user see concrete
>> >> >> >> errors?
>> >> >> >>
>> >> >> >> Best,
>> >> >> >>
>> >> >> >> Nan
>> >> >> >>
>> >> >> >> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu <[email protected]> wrote:
>> >> >> >>
>> >> >> >>> Hi Nan,
>> >> >> >>>
>> >> >> >>> Users have 2.11, but with a different minor version, will it cause
>> >> >> >>> conflicts?
>> >> >> >>>
>> >> >> >>> 2017-08-17 0:19 GMT+08:00 Nan Zhu <[email protected]>:
>> >> >> >>> > Hi, Yizhi,
>> >> >> >>> >
>> >> >> >>> > You mean users have 2.10 env while we assemble 2.11 in it?
>> >> >> >>> >
>> >> >> >>> > Best,
>> >> >> >>> >
>> >> >> >>> > Nan
>> >> >> >>> >
>> >> >> >>> > On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu <[email protected]> wrote:
>> >> >> >>> >
>> >> >> >>> >> Hi Joern,
>> >> >> >>> >>
>> >> >> >>> >> The point is that, the front is not a simple wrapper of c_api.h, as
>> >> >> >>> >> you mentioned, which can be easily achieved by JavaCPP.
>> >> >> >>> >>
>> >> >> >>> >> I have noticed the potential conflicts between the assembled scala
>> >> >> >>> >> library and the one in users' environment. Can we remove the scala
>> >> >> >>> >> library from the assembly jar? @Nan It wouldn't be a problem since the
>> >> >> >>> >> scala libraries with same major version are compatible.
>> >> >> >>> >>
>> >> >> >>> >> 2017-08-16 23:49 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> >> >>> >> > Hello,
>> >> >> >>> >> >
>> >> >> >>> >> > I personally had quite some issues with Scala dependencies in
>> >> >> >>> >> > different versions and Spark, where one version is not compatible with
>> >> >> >>> >> > the other version.
>> >> >> >>> >> > Then you need to debug the dependency tree to find
>> >> >> >>> >> > the places where the versions don't match. Every project which would
>> >> >> >>> >> > like to use MXnet then has to depend on Scala and might also get
>> >> >> >>> >> > conflicts if other dependencies depend on different Scala versions.
>> >> >> >>> >> > Probably something which will cause issues for some of your users.
>> >> >> >>> >> > Users who want to use Java might not be familiar with Scala dependency
>> >> >> >>> >> > problems and have a hard time resolving them by getting strange error
>> >> >> >>> >> > messages.
>> >> >> >>> >> >
>> >> >> >>> >> > The JNI layer could be generated with JavaCPP, then we would not need
>> >> >> >>> >> > to write/maintain the C and the jvm side for that our self.
>> >> >> >>> >> > A good example of JavaCPP and Scala usage is Apache Mahout [1].
>> >> >> >>> >> >
>> >> >> >>> >> > Even if we don't use JavaCPP, the JNI layer should be easy to get into
>> >> >> >>> >> > a state where both can share it, the current Scala JNI layers LibInfo
>> >> >> >>> >> > classes could be converted to Java classes and would in most cases
>> >> >> >>> >> > require only minor changes in the Scala code.
>> >> >> >>> >> >
>> >> >> >>> >> > Jörn
>> >> >> >>> >> >
>> >> >> >>> >> > [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
>> >> >> >>> >> >
>> >> >> >>> >> > On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu <[email protected]> wrote:
>> >> >> >>> >> >> I agree with Yizhi
>> >> >> >>> >> >>
>> >> >> >>> >> >> My major concern is the duplicate implementations, which are usually
>> >> >> >>> >> >> one of the major sources of bugs, especially with two languages which are
>> >> >> >>> >> >> naturally interactive (OK, Calling Scala from Java might need some more
>> >> >> >>> >> >> efforts). It is just like we provide C++ & C APIs of MxNet in two separated
>> >> >> >>> >> >> packages.
>> >> >> >>> >> >>
>> >> >> >>> >> >> About dependency problem, when you say "As far as I see this has the great
>> >> >> >>> >> >> disadvantage that the Java API would force Scala as a dependency onto the
>> >> >> >>> >> >> java users.", would you please give a concrete example causing critical
>> >> >> >>> >> >> issues?
>> >> >> >>> >> >>
>> >> >> >>> >> >> Best,
>> >> >> >>> >> >>
>> >> >> >>> >> >> Nan
>> >> >> >>> >> >>
>> >> >> >>> >> >> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu <[email protected]> wrote:
>> >> >> >>> >> >>
>> >> >> >>> >> >>> Hi,
>> >> >> >>> >> >>>
>> >> >> >>> >> >>> If we build the Java API from the very beginning, i.e. the JNI part,
>> >> >> >>> >> >>> we have to rewrite the codes for training, predict, inferShape, etc.
>> >> >> >>> >> >>> It would be too heavy to maintain a totally new front language.
>> >> >> >>> >> >>>
>> >> >> >>> >> >>> As far as I see, I don't think Scala library dependency would be a big
>> >> >> >>> >> >>> problem in most cases, unless we are going to use it in embedded
>> >> >> >>> >> >>> devices. Could you illustrate some use-cases where you cannot involve
>> >> >> >>> >> >>> Scala dependencies?
>> >> >> >>> >> >>>
>> >> >> >>> >> >>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> >> >>> >> >>> > Hello,
>> >> >> >>> >> >>> >
>> >> >> >>> >> >>> > the approach which is taken by Spark is described here [1].
>> >> >> >>> >> >>> >
>> >> >> >>> >> >>> > As far as I see this has the great disadvantage that the Java API
>> >> >> >>> >> >>> > would force Scala as a dependency onto the java users.
>> >> >> >>> >> >>> > For a library it is always a great advantage if it doesn't have many
>> >> >> >>> >> >>> > dependencies, or zero dependencies. In our case it could be quite
>> >> >> >>> >> >>> > realistic to have a thin wrapper around the C API without needing any
>> >> >> >>> >> >>> > other dependencies (or only dependencies which can't be avoided).
>> >> >> >>> >> >>> >
>> >> >> >>> >> >>> > The JNI layer could easily be shared between the Java and Scala API.
>> >> >> >>> >> >>> > As far as I understand is the JNI layer in the Scala API anyway
>> >> >> >>> >> >>> > private and a change to it wouldn't require that the public part of
>> >> >> >>> >> >>> > the Scala API is changed.
>> >> >> >>> >> >>> >
>> >> >> >>> >> >>> > What do you think?
>> >> >> >>> >> >>> >
>> >> >> >>> >> >>> > Jörn
>> >> >> >>> >> >>> >
>> >> >> >>> >> >>> > [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
>> >> >> >>> >> >>> >
>> >> >> >>> >> >>> > On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu <[email protected]> wrote:
>> >> >> >>> >> >>> >> Hi Joern,
>> >> >> >>> >> >>> >>
>> >> >> >>> >> >>> >> I suggest to build Java API as a wrapper of Scala API, re-use most of
>> >> >> >>> >> >>> >> the procedures. Referring to the Java API in Apache Spark.
>> >> >> >>> >> >>> >>
>> >> >> >>> >> >>> >> 2017-08-16 18:21 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> >> >>> >> >>> >>> Hello all,
>> >> >> >>> >> >>> >>>
>> >> >> >>> >> >>> >>> I would like to propose the addition of a Java API to MXNet.
>> >> >> >>> >> >>> >>>
>> >> >> >>> >> >>> >>> There has been some previous work done for the Scala API, and it makes
>> >> >> >>> >> >>> >>> sense to at least share the JNI layer between the two.
>> >> >> >>> >> >>> >>>
>> >> >> >>> >> >>> >>> The Java API probably should be aligned with the Python API (and
>> >> >> >>> >> >>> >>> others which exist already) with a few changes to give it a native
>> >> >> >>> >> >>> >>> Java feel.
>> >> >> >>> >> >>> >>>
>> >> >> >>> >> >>> >>> As far as I understand there are multiple people interested to work on
>> >> >> >>> >> >>> >>> this and it would be good to maybe come up with a written proposal on
>> >> >> >>> >> >>> >>> how things should be.
>> >> >> >>> >> >>> >>>
>> >> >> >>> >> >>> >>> My motivation is to get a Java API which can be used by Apache OpenNLP
>> >> >> >>> >> >>> >>> to solve various NLP tasks using Deep Learning based approaches and I
>> >> >> >>> >> >>> >>> am also interested to work on MXNet.
>> >> >> >>> >> >>> >>>
>> >> >> >>> >> >>> >>> Jörn
>> >> >> >>> >> >>> >>
>> >> >> >>> >> >>> >> --
>> >> >> >>> >> >>> >> Yizhi Liu
>> >> >> >>> >> >>> >> DMLC member
>> >> >> >>> >> >>> >> Technical Manager
>> >> >> >>> >> >>> >> Qihoo 360 Inc, Shanghai, China
>> >> >
>> >> > --
>> >> > Sandeep Krishnamurthy
