It would be great to see some examples of the benefits of the proposed API. I understand that certain syntax may look unsatisfying when calling the Scala API from Java, so the questions to ask are:

1. Is it intolerable to the extent that a new language binding must be added?
2. Is this something that can be fixed through simpler methods?

It wouldn't be a good investment of time if the gain is only marginal.

Best regards,
-sz

On 8/16/17, 12:46 PM, "Joern Kottmann" <[email protected]> wrote:

Seems like we all agree about the idea to add a Java API.

Maybe it is just me, but it wouldn't make sense at all for me (OpenNLP
use case) to use the Java API when it requires a Scala dependency,
because at that point I would be better off just using the Scala API
and ensuring that the things I build are compatible with Java.

So if I don't want to add Scala as a dependency, then I am better off
building something on top of a generated JNI layer. As far as I can
tell from my tests with the scala-package, you can get quite far with
MXNet using NDArray and the Symbol API.

Maybe we could work on this from two sides as described by Pracheer.
If we have a well-defined Java API, you could look at the work I have
done by then and see how it can be plugged in or what can be learnt
from it.

Jörn

On Wed, Aug 16, 2017 at 9:05 PM, Nan Zhu <[email protected]> wrote:

> +1 for Sandeep's suggestion
>
> On Wed, Aug 16, 2017 at 11:21 AM, YiZhi Liu <[email protected]> wrote:
>
>> Agree with Sandeep, though I guess the performance won't change. But
>> yes, benchmarks talk.
>>
>> Moreover, in the Scala package we use macros to generate operators
>> automatically, which will require more effort if we switch to pure
>> Java.
>>
>> 2017-08-17 2:12 GMT+08:00 sandeep krishnamurthy <[email protected]>:
>> > The fastest way to get a Java binding is to build Java native
>> > wrappers on the Scala package.
>> > Disadvantages would be:
>> > * *Bloated library size:* May not be suitable for users planning to
>> >   use Java APIs on Android or other such smaller systems.
>> > * *Performance:* Performance may not be as good as building directly
>> >   over JNI and implementing from the ground up. For example, taking
>> >   NDArray dimensions as a Java ArrayList, then converting it to a
>> >   Scala Seq to adapt to the Scala NDArray API, and more such adapters.
>> >
>> > However, building from the ground up on JNI would be a huge effort
>> > without actually getting feedback from users early.
>> >
>> > *My Plan:*
>> > 1. Build the Java interface on top of the Scala package.
>> > 2. Get early feedback from users. It may turn out Java is not a great
>> >    candidate for DL training jobs.
>> > 3. Solidify the interface (APIs) for Java users.
>> > 4. Do performance benchmarks comparing the Scala native interface
>> >    with the Java interface. This gives us comparable numbers on
>> >    performance in Java.
>> > 5. Over a period of time, replace the underlying Scala usage with a
>> >    JNI base and a native Java implementation, provided feedback from
>> >    users is positive.
>> >
>> > Comments/Suggestions?
>> >
>> > Regards,
>> > Sandeep
>> >
>> > On Wed, Aug 16, 2017 at 10:56 AM, YiZhi Liu <[email protected]> wrote:
>> >
>> >> What Nan and I are worried about is the re-implementation of
>> >> something like
>> >> https://github.com/apache/incubator-mxnet/blob/master/scala-package/core/src/main/scala/ml/dmlc/mxnet/Model.scala#L246,
>> >> and the executorManager, NDArray, KVStore ... it uses.
>> >>
>> >> The C API stays at a very low level. If this is the purpose, we can
>> >> simply move ml.dmlc.mxnet.LibInfo to a 'java' folder and compile
>> >> without Scala; there is no need to introduce JavaCPP. But I don't
>> >> think this is what users want.
>> >>
>> >> 2017-08-17 1:41 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> > There will be a new Scala version one day, and the story we had with
>> >> > going from 2.10 to 2.11 might just repeat. In the end, if you make a
>> >> > dependency using Scala, you just end up making it for the currently
>> >> > popular Scala versions. That might be OK for projects with
>> >> > developers who are familiar with these issues, but it is not OK for
>> >> > Java projects, where people might not expect it or know about these
>> >> > problems. It just makes it harder to use.
>> >> >
>> >> > To me it looks like the C API is very stable and used by all/most
>> >> > other APIs. If we have a Java API, accessing the C API via JavaCPP,
>> >> > then we should end up with a pretty stable solution, and a lot of
>> >> > the code that is duplicated with the Scala API is the generated code.
>> >> >
>> >> > I think we should explore this possible way of implementing it with
>> >> > a proof-of-concept.
>> >> >
>> >> > And if we have a well-made Java API, it might be something which
>> >> > wouldn't need a lot of additions to be pleasant to use from Scala.
>> >> >
>> >> > Jörn
>> >> >
>> >> > On Wed, Aug 16, 2017 at 6:45 PM, Nan Zhu <[email protected]> wrote:
>> >> >> I don't think there will be problems under "11". Did the user see
>> >> >> concrete errors?
>> >> >>
>> >> >> Best,
>> >> >>
>> >> >> Nan
>> >> >>
>> >> >> On Wed, Aug 16, 2017 at 9:30 AM, YiZhi Liu <[email protected]> wrote:
>> >> >>
>> >> >>> Hi Nan,
>> >> >>>
>> >> >>> Users have 2.11, but with a different minor version. Will it cause
>> >> >>> conflicts?
>> >> >>>
>> >> >>> 2017-08-17 0:19 GMT+08:00 Nan Zhu <[email protected]>:
>> >> >>> > Hi, Yizhi,
>> >> >>> >
>> >> >>> > You mean users have a 2.10 env while we assemble 2.11 in it?
>> >> >>> >
>> >> >>> > Best,
>> >> >>> >
>> >> >>> > Nan
>> >> >>> >
>> >> >>> > On Wed, Aug 16, 2017 at 9:08 AM, YiZhi Liu <[email protected]> wrote:
>> >> >>> >
>> >> >>> >> Hi Joern,
>> >> >>> >>
>> >> >>> >> The point is that the front end is not a simple wrapper of
>> >> >>> >> c_api.h, as you mentioned, which can be easily achieved by JavaCPP.
>> >> >>> >>
>> >> >>> >> I have noticed the potential conflicts between the assembled Scala
>> >> >>> >> library and the one in users' environment. Can we remove the Scala
>> >> >>> >> library from the assembly jar? @Nan It wouldn't be a problem since
>> >> >>> >> Scala libraries with the same major version are compatible.
>> >> >>> >>
>> >> >>> >> 2017-08-16 23:49 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> >>> >> > Hello,
>> >> >>> >> >
>> >> >>> >> > I personally had quite some issues with Scala dependencies in
>> >> >>> >> > different versions and Spark, where one version is not compatible
>> >> >>> >> > with the other version. Then you need to debug the dependency tree
>> >> >>> >> > to find the places where the versions don't match. Every project
>> >> >>> >> > which would like to use MXNet then has to depend on Scala and
>> >> >>> >> > might also get conflicts if other dependencies depend on different
>> >> >>> >> > Scala versions. This will probably cause issues for some of your
>> >> >>> >> > users. Users who want to use Java might not be familiar with Scala
>> >> >>> >> > dependency problems and may have a hard time resolving them when
>> >> >>> >> > they get strange error messages.
>> >> >>> >> >
>> >> >>> >> > The JNI layer could be generated with JavaCPP; then we would not
>> >> >>> >> > need to write/maintain the C and the JVM side for that ourselves.
>> >> >>> >> > A good example of JavaCPP and Scala usage is Apache Mahout [1].
>> >> >>> >> >
>> >> >>> >> > Even if we don't use JavaCPP, the JNI layer should be easy to get
>> >> >>> >> > into a state where both can share it. The current Scala JNI
>> >> >>> >> > layer's LibInfo classes could be converted to Java classes, which
>> >> >>> >> > would in most cases require only minor changes in the Scala code.
>> >> >>> >> >
>> >> >>> >> > Jörn
>> >> >>> >> >
>> >> >>> >> > [1] https://github.com/apache/mahout/tree/master/viennacl/src/main
>> >> >>> >> >
>> >> >>> >> > On Wed, Aug 16, 2017 at 5:30 PM, Nan Zhu <[email protected]> wrote:
>> >> >>> >> >> I agree with Yizhi.
>> >> >>> >> >>
>> >> >>> >> >> My major concern is the duplicate implementations, which are
>> >> >>> >> >> usually one of the major sources of bugs, especially with two
>> >> >>> >> >> languages which are naturally interoperable (OK, calling Scala
>> >> >>> >> >> from Java might need some more effort). It is just like providing
>> >> >>> >> >> C++ & C APIs of MXNet in two separate packages.
>> >> >>> >> >>
>> >> >>> >> >> About the dependency problem, when you say "As far as I see this
>> >> >>> >> >> has the great disadvantage that the Java API would force Scala as
>> >> >>> >> >> a dependency onto the java users.", would you please give a
>> >> >>> >> >> concrete example causing critical issues?
>> >> >>> >> >>
>> >> >>> >> >> Best,
>> >> >>> >> >>
>> >> >>> >> >> Nan
>> >> >>> >> >>
>> >> >>> >> >> On Wed, Aug 16, 2017 at 8:19 AM, YiZhi Liu <[email protected]> wrote:
>> >> >>> >> >>
>> >> >>> >> >>> Hi,
>> >> >>> >> >>>
>> >> >>> >> >>> If we build the Java API from the very beginning, i.e. the JNI
>> >> >>> >> >>> part, we have to rewrite the code for training, predict,
>> >> >>> >> >>> inferShape, etc. It would be too heavy to maintain a totally new
>> >> >>> >> >>> front language.
>> >> >>> >> >>>
>> >> >>> >> >>> As far as I see, I don't think the Scala library dependency would
>> >> >>> >> >>> be a big problem in most cases, unless we are going to use it in
>> >> >>> >> >>> embedded devices. Could you illustrate some use cases where you
>> >> >>> >> >>> cannot involve Scala dependencies?
>> >> >>> >> >>>
>> >> >>> >> >>> 2017-08-16 22:13 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> >>> >> >>> > Hello,
>> >> >>> >> >>> >
>> >> >>> >> >>> > the approach which is taken by Spark is described here [1].
>> >> >>> >> >>> >
>> >> >>> >> >>> > As far as I see this has the great disadvantage that the Java API
>> >> >>> >> >>> > would force Scala as a dependency onto the java users.
>> >> >>> >> >>> > For a library it is always a great advantage if it doesn't have
>> >> >>> >> >>> > many dependencies, or zero dependencies. In our case it could be
>> >> >>> >> >>> > quite realistic to have a thin wrapper around the C API without
>> >> >>> >> >>> > needing any other dependencies (or only dependencies which can't
>> >> >>> >> >>> > be avoided).
>> >> >>> >> >>> >
>> >> >>> >> >>> > The JNI layer could easily be shared between the Java and Scala
>> >> >>> >> >>> > APIs. As far as I understand, the JNI layer in the Scala API is
>> >> >>> >> >>> > private anyway, and a change to it wouldn't require changing the
>> >> >>> >> >>> > public part of the Scala API.
>> >> >>> >> >>> >
>> >> >>> >> >>> > What do you think?
>> >> >>> >> >>> >
>> >> >>> >> >>> > Jörn
>> >> >>> >> >>> >
>> >> >>> >> >>> > [1] https://cwiki.apache.org/confluence/display/SPARK/Java+API+Internals
>> >> >>> >> >>> >
>> >> >>> >> >>> > On Wed, Aug 16, 2017 at 3:39 PM, YiZhi Liu <[email protected]> wrote:
>> >> >>> >> >>> >> Hi Joern,
>> >> >>> >> >>> >>
>> >> >>> >> >>> >> I suggest building the Java API as a wrapper of the Scala API,
>> >> >>> >> >>> >> re-using most of the procedures. Refer to the Java API in
>> >> >>> >> >>> >> Apache Spark.
>> >> >>> >> >>> >>
>> >> >>> >> >>> >> 2017-08-16 18:21 GMT+08:00 Joern Kottmann <[email protected]>:
>> >> >>> >> >>> >>> Hello all,
>> >> >>> >> >>> >>>
>> >> >>> >> >>> >>> I would like to propose the addition of a Java API to MXNet.
>> >> >>> >> >>> >>>
>> >> >>> >> >>> >>> There has been some previous work done for the Scala API, and it
>> >> >>> >> >>> >>> makes sense to at least share the JNI layer between the two.
>> >> >>> >> >>> >>>
>> >> >>> >> >>> >>> The Java API probably should be aligned with the Python API (and
>> >> >>> >> >>> >>> others which exist already), with a few changes to give it a
>> >> >>> >> >>> >>> native Java feel.
>> >> >>> >> >>> >>>
>> >> >>> >> >>> >>> As far as I understand, there are multiple people interested in
>> >> >>> >> >>> >>> working on this, and it would be good to maybe come up with a
>> >> >>> >> >>> >>> written proposal on how things should be.
>> >> >>> >> >>> >>>
>> >> >>> >> >>> >>> My motivation is to get a Java API which can be used by Apache
>> >> >>> >> >>> >>> OpenNLP to solve various NLP tasks using Deep Learning based
>> >> >>> >> >>> >>> approaches, and I am also interested in working on MXNet.
>> >> >>> >> >>> >>>
>> >> >>> >> >>> >>> Jörn
>> >> >>> >> >>> >>
>> >> >>> >> >>> >> --
>> >> >>> >> >>> >> Yizhi Liu
>> >> >>> >> >>> >> DMLC member
>> >> >>> >> >>> >> Technical Manager
>> >> >>> >> >>> >> Qihoo 360 Inc, Shanghai, China
>> >
>> > --
>> > Sandeep Krishnamurthy
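
To make the "thin wrapper" and "adapter" points from the thread a bit more concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: `NativeBackend`, `FakeBackend`, and `JavaNDArray` are hypothetical names invented for this example and are not MXNet, Scala-package, or JavaCPP classes. The in-memory backend merely stands in for a handle-based native layer so that the sketch runs without any JNI code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ThinWrapperSketch {

    /** Hypothetical stand-in for a handle-based native layer (not a real MXNet interface). */
    interface NativeBackend {
        long createArray(int[] shape); // returns an opaque handle
        int[] getShape(long handle);
        void free(long handle);
    }

    /** In-memory fake so the sketch runs without JNI; a real binding would call native code. */
    static final class FakeBackend implements NativeBackend {
        private final Map<Long, int[]> shapes = new HashMap<>();
        private long nextHandle = 1;

        @Override
        public long createArray(int[] shape) {
            long handle = nextHandle++;
            shapes.put(handle, shape.clone());
            return handle;
        }

        @Override
        public int[] getShape(long handle) {
            int[] shape = shapes.get(handle);
            if (shape == null) {
                throw new IllegalStateException("invalid handle " + handle);
            }
            return shape.clone();
        }

        @Override
        public void free(long handle) {
            shapes.remove(handle);
        }
    }

    /** Java-facing facade: accepts idiomatic collections and adapts them once at the boundary. */
    static final class JavaNDArray implements AutoCloseable {
        private final NativeBackend backend;
        private final long handle;

        JavaNDArray(NativeBackend backend, List<Integer> shape) {
            this.backend = backend;
            // The adapter step discussed in the thread: List<Integer> -> int[] (unboxing plus a copy).
            int[] dims = shape.stream().mapToInt(Integer::intValue).toArray();
            this.handle = backend.createArray(dims);
        }

        /** Returns the shape as a Java list, copying out of the backend representation. */
        List<Integer> shape() {
            List<Integer> out = new ArrayList<>();
            for (int d : backend.getShape(handle)) {
                out.add(d);
            }
            return out;
        }

        /** Total number of elements, i.e. the product of all dimensions. */
        int size() {
            int n = 1;
            for (int d : backend.getShape(handle)) {
                n *= d;
            }
            return n;
        }

        @Override
        public void close() {
            backend.free(handle);
        }
    }

    public static void main(String[] args) {
        NativeBackend backend = new FakeBackend();
        try (JavaNDArray array = new JavaNDArray(backend, Arrays.asList(2, 3, 4))) {
            System.out.println(array.shape() + " -> " + array.size() + " elements");
        }
    }
}
```

Running `main` prints `[2, 3, 4] -> 24 elements`. The sketch has no dependency beyond the JDK, which is the property Jörn argues for. If the facade sat on top of the Scala package instead, the same boundary would need an additional collection conversion, which is the kind of adapter cost Sandeep suggests benchmarking.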
